Construction method of voiceprint library based on multi-scale frequency-channel attention fusion

Authors: CHEN Tong; YANG Fengyu [1]; XIONG Yu; YAN Hong; QIU Fuxing (School of Software, Nanchang Hangkong University, Nanchang, Jiangxi 330063, China)

Affiliation: [1] School of Software, Nanchang Hangkong University, Nanchang 330063, China

Source: Journal of Computer Applications (《计算机应用》), 2024, No. 8, pp. 2407-2413 (7 pages)

Funding: National Natural Science Foundation of China (61762067).

Abstract: To address the problem that speaker verification accuracy is easily affected by external factors, a speaker verification algorithm is proposed based on a Multi-scale Frequency-Channel Attention fused Time-Delay Neural Network (MFCA-TDNN) model. MFCA-TDNN makes three improvements to ECAPA-TDNN (Emphasized Channel Attention Propagation Aggregation Time Delay Neural Network): a multi-scale frequency-channel attention front-end that obtains high-resolution feature representations from utterances; a multi-scale channel attention module that combines local and global features to fuse multi-scale information; and a feature attention fusion module that weights the fused multi-scale features. These improvements enable the model to make better use of multi-scale time-frequency information and improve its recognition capability. Experimental results show that, compared with the ECAPA-TDNN model, the MFCA-TDNN model reduces the Equal Error Rate (EER) and the minimum Detection Cost Function (minDCF) by 5.9% and 7.9%, respectively; the lowest EER reaches 3.83% and the lowest minDCF reaches 0.2202.
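The abstract reports results as EER and minDCF. As a reading aid, the following minimal sketch shows how these two metrics are commonly computed from verification trial scores. It is not the authors' evaluation code: the function names are hypothetical, and the cost parameters (`p_target=0.01`, `c_miss=1.0`, `c_fa=1.0`) are assumed defaults, since the abstract does not state the settings used in the paper.

```python
def error_rates(target_scores, nontarget_scores):
    """Return (threshold, far, frr) for every candidate decision threshold.

    far = fraction of non-target (impostor) trials accepted,
    frr = fraction of target (genuine) trials rejected.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    # Add one threshold above every score so "reject everything" is covered.
    thresholds.append(thresholds[-1] + 1.0)
    rates = []
    for t in thresholds:
        # A trial is accepted when its score is >= the threshold.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        frr = sum(s < t for s in target_scores) / len(target_scores)
        rates.append((t, far, frr))
    return rates


def compute_eer(target_scores, nontarget_scores):
    """Equal Error Rate: the operating point where far and frr are closest."""
    rates = error_rates(target_scores, nontarget_scores)
    _, far, frr = min(rates, key=lambda r: abs(r[1] - r[2]))
    return (far + frr) / 2.0


def compute_min_dcf(target_scores, nontarget_scores,
                    p_target=0.01, c_miss=1.0, c_fa=1.0):
    """Minimum normalized detection cost over all thresholds.

    p_target, c_miss and c_fa are assumed values, not taken from the paper.
    """
    rates = error_rates(target_scores, nontarget_scores)
    best = min(c_miss * p_target * frr + c_fa * (1.0 - p_target) * far
               for _, far, frr in rates)
    # Normalize by the cost of the best trivial system (accept or reject all).
    return best / min(c_miss * p_target, c_fa * (1.0 - p_target))


if __name__ == "__main__":
    # Toy scores: higher means "more likely the same speaker".
    targets = [0.9, 0.8, 0.7, 0.4]
    nontargets = [0.6, 0.3, 0.2, 0.1]
    print("EER:", compute_eer(targets, nontargets))
    print("minDCF:", compute_min_dcf(targets, nontargets))
```

On the toy scores above, one target trial (0.4) scores below one non-target trial (0.6), so at a threshold of 0.6 both error rates equal 1/4 and the EER is 0.25. Real evaluations use many thousands of trials and usually interpolate between thresholds rather than picking the nearest one.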

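Keywords: voiceprint library; time-delay neural network; multi-scale feature extraction; frequency-channel attention; feature attention fusion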

CLC number: TP183 [Automation and Computer Technology: Control Theory and Control Engineering]

 
