End-to-End Speaker Recognition Model for Joint Supervision Based on Attention Mechanism


Authors: Shi Wanglei; Feng Shuang [1]

Affiliation: [1] Key Laboratory of Intelligent Financial Media of Ministry of Education, Communication University of China, Beijing 100024, China

Source: Information & Computer (《信息与电脑》), 2020, No. 4, pp. 145-147 (3 pages)

Abstract: The application of deep learning models to biometrics has pushed speaker recognition into a new stage. Early deep learning models for speaker recognition were mainly deep neural networks (DNNs), which improved recognition performance to some extent but left both training speed and recognition accuracy in need of improvement. Building on local feature extraction, the authors introduce a convolutional neural network (CNN), which is less complex to train, and use skip connections to address the vanishing-gradient problem that arises during training as the number of convolutional layers grows. During training, utterances are aggregated from the frame level to the segment level with an attention mechanism, and the model is trained under the joint supervision of softmax loss and center loss, which greatly improves the performance of CNNs for speaker recognition.
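The two training-stage ideas named in the abstract, attention-based aggregation from frame level to segment level and joint supervision by softmax loss and center loss, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the scoring vector `v`, and the weight `lam` are hypothetical, the attention score is assumed to be a simple dot product, and the center loss is assumed to take its standard form of half the squared distance to the class center. Plain Python is used to keep the example dependency-free.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attentive_pool(frames, v):
    """Aggregate frame-level vectors into one segment-level vector.

    Each frame is scored by its dot product with v (a hypothetical
    learned parameter); softmax turns the scores into weights, and the
    segment vector is the weighted sum of the frames.
    """
    scores = [sum(f_d * v_d for f_d, v_d in zip(f, v)) for f in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    return [sum(w * f[d] for w, f in zip(weights, frames)) for d in range(dim)]

def softmax_loss(logits, label):
    """Cross-entropy between softmax(logits) and the true speaker label."""
    return -math.log(softmax(logits)[label])

def center_loss(embedding, center):
    """Half the squared Euclidean distance to the speaker's class center."""
    return 0.5 * sum((e - c) ** 2 for e, c in zip(embedding, center))

def joint_loss(logits, label, embedding, centers, lam=0.01):
    """Softmax loss plus lam times center loss (lam is an assumed value)."""
    return softmax_loss(logits, label) + lam * center_loss(embedding, centers[label])

# Usage: pool three 2-dimensional frames, then score the result
# against two hypothetical speakers.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
segment = attentive_pool(frames, v=[0.5, -0.5])
centers = [[1.0, 0.0], [0.0, 1.0]]
loss = joint_loss(logits=[2.0, 0.5], label=0, embedding=segment, centers=centers)
```

In this form the softmax term separates speakers from one another, while the center term pulls each segment-level embedding toward the center of its own speaker, so `lam` trades off between-class separation against within-class compactness.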

Keywords: speaker recognition; convolutional neural network; aggregation; joint supervision

Classification code: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]

 
