Speaker Recognition Based on Adversarial Invariance Disentanglement


Authors: HUANG Duolin (黄多林); LIU Dong (刘栋); ZHENG Zhishen (郑智燊)

Affiliation: [1] School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang 212013

Source: Computer & Digital Engineering (《计算机与数字工程》), 2022, No. 4, pp. 833-838 (6 pages)

Funding: Supported by the Jiangsu Province College Students' Innovation and Entrepreneurship Project (No. 201810299045Z).

Abstract: To improve the performance of speaker recognition models, this paper proposes a novel method for extracting robust, speaker-discriminative features. The method maps the speaker embedding into two lower-dimensional embedding spaces. Through adversarial disentanglement training and an attention mechanism, one embedding space separates speaker-related information from all other information in the speech signal, while the other captures the remaining irrelevant nuisance factors. Experimental results show that, in two experimental settings on the TIMIT dataset, the proposed method improves recognition accuracy by 2.74% and 2.86%, respectively, over two state-of-the-art methods. An analysis of test-set loss and recognition accuracy further indicates that both the attention mechanism and the disentanglement module contribute to the method's speaker recognition performance.
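The idea summarized in the abstract can be illustrated with a minimal forward-pass sketch in Python with NumPy. This is not the authors' implementation: the dot-product attention scoring, the linear projections with tanh, the dimensions, and the mean-squared-error disentangler loss are all assumptions chosen for illustration, and every name (`attention_pool`, `split_embedding`, `W_dis`, and so on) is hypothetical. Weights are random, so the sketch shows only the data flow, not a trained model.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_pool(frames, w_att):
    # frames: (T, D) frame-level features; w_att: (D,) scoring vector (assumed form).
    alpha = softmax(frames @ w_att)   # (T,) attention weights that sum to 1
    return alpha @ frames, alpha      # (D,) utterance-level vector and the weights

def split_embedding(h, W_spk, W_nui):
    # Project the pooled vector into two lower-dimensional embeddings:
    # one intended for speaker information, one for nuisance factors.
    return np.tanh(W_spk @ h), np.tanh(W_nui @ h)

rng = np.random.default_rng(0)
T, D, d_spk, d_nui = 50, 64, 16, 16   # illustrative sizes, not from the paper

frames = rng.standard_normal((T, D))             # stand-in for acoustic features
w_att = rng.standard_normal(D)
W_spk = 0.1 * rng.standard_normal((d_spk, D))
W_nui = 0.1 * rng.standard_normal((d_nui, D))
W_dis = 0.1 * rng.standard_normal((d_nui, d_spk))  # hypothetical disentangler

h, alpha = attention_pool(frames, w_att)
e_spk, e_nui = split_embedding(h, W_spk, W_nui)

# Assumed adversarial term: the disentangler tries to predict e_nui from e_spk
# (minimizing this loss), while the encoder would be trained to make that
# prediction fail (maximizing it), pushing the two embeddings apart.
dis_loss = float(np.mean((W_dis @ e_spk - e_nui) ** 2))

print(alpha.shape, e_spk.shape, e_nui.shape)
```

In a full system the speaker embedding `e_spk` would feed a speaker classifier, and the encoder and disentangler would be updated in alternation; those training steps are omitted here because the abstract does not specify them.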

Keywords: speaker recognition; deep learning; attention mechanism; adversarial invariance disentanglement

Classification number: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]

 
