Authors: Shi Wanglei, Feng Shuang [1]
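Affiliation: [1] Key Laboratory of Intelligent Financial Media of Ministry of Education, Communication University of China, Beijing 100024, China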
Source: Information & Computer (《信息与电脑》), 2020, No. 4, pp. 145-147 (3 pages)
Abstract: The application of deep learning models to biometrics has pushed speaker recognition into a new stage. Early deep learning models for speaker recognition were mainly deep neural networks (DNNs), which improved recognition performance to some extent but left both training speed and recognition accuracy in need of improvement. Building on local feature extraction, the authors introduce a convolutional neural network (CNN), which is less complex to train, and use skip connections to address the vanishing-gradient problem that arises during training as the number of convolutional layers grows. During training, utterances are aggregated from the frame level to the segment level with an attention mechanism, and the model is trained under the joint supervision of softmax loss and center loss. The authors report that this greatly improves the performance of CNNs for speaker recognition.
Classification No.: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]
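The abstract names two training components: attention-based aggregation of frame-level features into a segment-level embedding, and joint supervision by softmax loss and center loss. The sketch below illustrates both in plain Python. It is not the authors' implementation: the dot-product attention scoring, the function names, and the weighting factor `lam` are all assumptions made for illustration, and the record gives no architectural details or hyperparameters.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_pool(frames, attn_vector):
    """Aggregate frame-level features into one segment-level embedding.

    Assumption: each frame is scored by a dot product with a learned
    attention vector; the record does not say how scores are computed.
    """
    scores = [sum(f * a for f, a in zip(frame, attn_vector)) for frame in frames]
    weights = softmax(scores)
    dim = len(frames[0])
    # Weighted sum of the frames, one output value per feature dimension.
    return [sum(w * frame[d] for w, frame in zip(weights, frames)) for d in range(dim)]


def softmax_loss(logits, label):
    """Cross-entropy of the softmax output against the true speaker label."""
    return -math.log(softmax(logits)[label])


def center_loss(embedding, center):
    """Half the squared distance between an embedding and its class center."""
    return 0.5 * sum((e - c) ** 2 for e, c in zip(embedding, center))


def joint_loss(embedding, logits, label, centers, lam=0.01):
    """Softmax loss plus lam times center loss (lam is a hypothetical value)."""
    return softmax_loss(logits, label) + lam * center_loss(embedding, centers[label])


# Usage: pool three 2-dimensional frames, then score the pooled embedding.
frames = [[0.2, 0.1], [0.9, 0.4], [0.5, 0.3]]
segment = attentive_pool(frames, attn_vector=[1.0, 0.5])
loss = joint_loss(segment, logits=[2.0, 0.5], label=0,
                  centers=[[0.6, 0.3], [0.1, 0.8]])
```

The center-loss term pulls each embedding toward the center of its speaker class, while the softmax term keeps the classes separable; `lam` trades off the two.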