Additive angular margin loss applied for text-independent speaker recognition

Authors: XIAO Jinzhuang[1]; LI Ruipeng; JI Mengmeng (College of Electronic Information Engineering, Hebei University, Baoding, Hebei 071000, China)

Affiliation: [1] College of Electronic Information Engineering, Hebei University, Baoding, Hebei 071000, China

Source: Laser Journal, 2021, No. 11, pp. 87-91 (5 pages)

Funding: Hebei Provincial Natural Science Foundation General Project (No. H2016201201); Key Project of Science and Technology Research of Hebei Higher Education Institutions (No. ZD2016149).

Abstract: To address the difficulty of extracting features from short utterances and the low model-training efficiency in text-independent speaker recognition, this paper exploits the additive angular margin loss (AAM-Softmax), which maximizes the classification margin in the angular space of the feature representation. It is combined with a residual network (ResNet) modified to improve the efficiency and stability of training, in order to obtain more discriminative embedding features and ultimately improve the performance of the end-to-end short-utterance text-independent speaker recognition model. Experiments show that Top-1 and Top-5 accuracy in the speaker identification task reach 90.1% and 97.8%, respectively, and the equal error rate (EER) in the speaker verification task is reduced to 3.8%. Compared with published results on the VoxCeleb1 dataset, all three metrics improve markedly, validating the effectiveness of the proposed method.
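The core idea of the AAM-Softmax loss summarized in the abstract, namely adding an angular margin m to the angle between an embedding and its target class weight before the scaled softmax, can be sketched as follows. This is a minimal illustrative implementation on toy lists, not the paper's actual code; the function name, the toy inputs, and the default values of the scale s and margin m are assumptions for illustration.

```python
import math

def aam_softmax_loss(embeddings, weights, labels, s=30.0, m=0.2):
    """Mean AAM-Softmax (additive angular margin) loss over a batch.

    embeddings: list of feature vectors, one per utterance
    weights:    list of class weight vectors, one per speaker
    labels:     list of target class indices
    s, m:       scale factor and additive angular margin (illustrative defaults)
    """
    def normalize(v):
        n = math.sqrt(sum(x * x for x in v))
        return [x / n for x in v]

    losses = []
    for emb, y in zip(embeddings, labels):
        e = normalize(emb)
        # Cosine similarity between the embedding and every class weight
        cosines = [sum(a * b for a, b in zip(e, normalize(w))) for w in weights]
        # Add the angular margin m only to the target class angle
        theta_y = math.acos(max(-1.0, min(1.0, cosines[y])))
        logits = [s * c for c in cosines]
        logits[y] = s * math.cos(theta_y + m)
        # Numerically stable cross-entropy on the margin-adjusted logits
        mx = max(logits)
        log_sum = mx + math.log(sum(math.exp(z - mx) for z in logits))
        losses.append(log_sum - logits[y])
    return sum(losses) / len(losses)
```

Because cos(theta + m) < cos(theta) for the target class, the margin makes the training objective strictly harder than plain softmax on cosine logits, which is what pushes same-speaker embeddings into a tighter angular region and enlarges the inter-class margin.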

Keywords: additive angular margin loss; speaker recognition; text-independent speech; deep learning; end-to-end

CLC number: TN249 [Electronics and Telecommunications — Physical Electronics]
