Research of Feature Parameters in Voiceprint Recognition Technology Based on GMM-UBM  (Cited by: 16)

Authors: ZHOU Yue-yuan (周玥媛); KONG Qin (孔钦) [1] (Nanjing University Jinling College, Nanjing 210089, China)

Affiliation: [1] Nanjing University Jinling College, Nanjing 210089, Jiangsu, China

Source: Computer Technology and Development (《计算机技术与发展》), 2020, No. 5, pp. 76-83 (8 pages)

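Funding: National University Computer Basic Education Research and Reform Project (AFCEC-2016-18); Key Teaching Reform Projects of Nanjing University Jinling College (0010521816, 0010521806).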

Abstract: The key to voiceprint recognition is extracting, from the speech signal, feature parameters that can characterize the speaker. Based on the Gaussian mixture model-universal background model (GMM-UBM), a text-independent voiceprint recognition system is implemented in Matlab, and the mainstream static feature parameters MFCC, LPCC, and LPC, as well as MFCC combined with dynamic parameters, are compared from two application perspectives: speaker verification and speaker identification. Using different feature parameter orders, different numbers of Gaussian mixture components, and different durations of training and test speech, the parameters are analyzed in terms of theoretical recognition performance, actual recognition performance, recognition time, and the proportion of recognition time. The results show that, under the GMM-UBM pattern recognition method, MFCC gives the best recognition performance among the three static feature parameters in the vast majority of cases, while also requiring the longest recognition time. The recognition rate does not increase monotonically with the order of the feature parameters. Combining static parameters with dynamic parameters of a suitable order can improve recognition performance, but increasing the order of the dynamic parameters does not necessarily improve it.
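The two application perspectives named in the abstract differ only in the decision rule applied to GMM scores. The following is a minimal Python sketch of that difference, not the authors' Matlab system. It assumes diagonal-covariance GMMs, and the function names (`gmm_avg_loglik`, `verify`, `identify`), the threshold of 0, and all model parameters and test frames are hypothetical. In an actual GMM-UBM system the speaker models would typically be adapted from a UBM trained on real feature vectors such as MFCC; here they are set by hand purely for illustration.

```python
import numpy as np

def gmm_avg_loglik(X, weights, means, variances):
    """Average per-frame log-likelihood of frames X (T, D)
    under a diagonal-covariance GMM with M components."""
    X = np.asarray(X, dtype=float)
    weights = np.asarray(weights, dtype=float)      # (M,)
    means = np.asarray(means, dtype=float)          # (M, D)
    variances = np.asarray(variances, dtype=float)  # (M, D)
    diff = X[:, None, :] - means[None, :, :]        # (T, M, D)
    log_norm = -0.5 * np.log(2.0 * np.pi * variances).sum(axis=1)  # (M,)
    log_comp = log_norm[None, :] - 0.5 * (diff ** 2 / variances[None, :, :]).sum(axis=2)
    log_weighted = np.log(weights)[None, :] + log_comp             # (T, M)
    # Numerically stable log-sum-exp over the mixture components.
    peak = log_weighted.max(axis=1, keepdims=True)
    frame_ll = peak[:, 0] + np.log(np.exp(log_weighted - peak).sum(axis=1))
    return float(frame_ll.mean())

def verify(X, speaker_gmm, ubm, threshold=0.0):
    """Speaker verification: accept the claimed identity if the average
    log-likelihood ratio against the UBM exceeds the threshold."""
    llr = gmm_avg_loglik(X, *speaker_gmm) - gmm_avg_loglik(X, *ubm)
    return bool(llr > threshold), llr

def identify(X, speaker_gmms):
    """Closed-set speaker identification: pick the enrolled speaker
    whose model gives the highest average log-likelihood."""
    scores = {name: gmm_avg_loglik(X, *gmm) for name, gmm in speaker_gmms.items()}
    return max(scores, key=scores.get), scores

# Hypothetical toy models: (weights, means, variances), 2 components, 2-D features.
unit_var = np.ones((2, 2))
ubm = ([0.5, 0.5], [[0.0, 0.0], [4.0, 4.0]], unit_var)
speaker_a = ([0.5, 0.5], [[-1.0, -1.0], [3.0, 3.0]], unit_var)
speaker_b = ([0.5, 0.5], [[1.0, 1.0], [5.0, 5.0]], unit_var)

# Synthetic test frames placed near one of speaker A's component means.
rng = np.random.default_rng(0)
X_test = rng.normal(loc=-1.0, scale=0.3, size=(50, 2))

accepted_a, llr_a = verify(X_test, speaker_a, ubm)
accepted_b, llr_b = verify(X_test, speaker_b, ubm)
best_speaker, all_scores = identify(X_test, {"A": speaker_a, "B": speaker_b})
print(accepted_a, accepted_b, best_speaker)
```

Verification compares one claimed model against the background model and needs a threshold, whereas identification ranks all enrolled models and needs none. This is why the two tasks can rank the same feature parameters differently.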

Keywords: GMM-UBM; voiceprint recognition; feature parameter performance; speaker verification; speaker identification

Classification code: TP301 [Automation and Computer Technology: Computer Systems Architecture]

 
