Speaker recognition based on bimodal fusion features

Authors: XIE Ya-li, PANG Wei-qian, BAI Jing, XUE Pei-yun, ZHAO Jian-xing, SHI Chen-kang (College of Information and Computer, Taiyuan University of Technology, Jinzhong 030600, China)

Affiliation: [1] College of Information and Computer, Taiyuan University of Technology, Jinzhong, Shanxi 030600, China

Source: Computer Engineering and Design, 2023, No. 8, pp. 2454-2458 (5 pages)

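Funding: Applied Basic Research Program of Shanxi Province (201901D111094); Shanxi Province Merit-Based Fund for Scientific and Technological Activities of Returned Overseas Scholars (20200017); Applied Basic Research Program of Shanxi Province (Youth Fund, 20210302124544).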

Abstract: To improve the accuracy of speaker recognition, an algorithm based on bimodal fusion features is proposed. Two kinds of acoustic features, prosodic features and gammatone filter cepstral coefficients, are extracted and their statistical characteristics are calculated. The articulatory movement parameters of the tongue, lips, and mandible, each relative to the bridge of the nose, are extracted to obtain reference-point articulatory movement features. The acoustic features and the reference-point articulatory movement features are fused, and embedded feature selection is applied to obtain the bimodal fusion features. Classification is performed with a support vector machine and a Gaussian mixture model-support vector machine. Experimental results show that the reference-point articulatory movement features yield better recognition than traditional articulatory movement features, and that the recognition rate of the bimodal fusion features is significantly higher than that of single-modal features, which verifies the effectiveness of the proposed method.
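The feature-construction steps in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the toy data, and the choice of mean and standard deviation as the "statistical characteristics" are all assumptions, since the abstract does not specify them. The sketch covers only the reference-point transformation and the concatenation-style fusion; the embedded feature selection and the SVM / GMM-SVM classifiers are not shown.

```python
from statistics import mean, pstdev


def relative_to_reference(sensor_track, reference_track):
    """Express each frame of a sensor trajectory relative to a reference point
    (here: the bridge of the nose) by per-frame coordinate subtraction."""
    return [tuple(s - r for s, r in zip(frame, ref))
            for frame, ref in zip(sensor_track, reference_track)]


def functionals(track):
    """Summarize a frame-level track as [mean, std] per dimension.
    Mean/std are an assumed choice of statistics, not taken from the paper."""
    dims = list(zip(*track))  # transpose: frames x dims -> dims x frames
    return [f(d) for d in dims for f in (mean, pstdev)]


def fuse(acoustic_frames, articulator_tracks, nose_bridge_track):
    """Concatenate acoustic statistics with reference-point articulatory statistics."""
    fused = functionals(acoustic_frames)
    for name in sorted(articulator_tracks):  # fixed order keeps the vector layout stable
        rel = relative_to_reference(articulator_tracks[name], nose_bridge_track)
        fused.extend(functionals(rel))
    return fused


# Toy data (hypothetical): 3 frames, 2 acoustic dims, 2-D sensor coordinates.
acoustic = [(0.1, 1.0), (0.2, 1.5), (0.3, 2.0)]
tracks = {
    "tongue": [(10, 4), (11, 5), (12, 4)],
    "lip": [(20, 2), (20, 3), (21, 2)],
    "mandible": [(15, 0), (15, 1), (16, 1)],
}
nose = [(5, 9), (5, 9), (6, 9)]

vec = fuse(acoustic, tracks, nose)
print(len(vec))
```

In this sketch, shifting every sensor (including the reference) by the same offset leaves the articulatory part of the vector unchanged, because only positions relative to the reference enter the statistics. The fused vector would then be passed to a feature-selection step and a classifier.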

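Keywords: prosodic features; gammatone filter cepstral coefficients; articulatory movement features; feature fusion; feature selection; Gaussian mixture model-support vector machine; speaker recognition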

Classification number: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]

 
