Authors: Li Jingyang (李敬阳)[1], Wu Minghui (吴明辉)[2], Wang Li (王莉)[1], Wang Xiaodi (王晓迪)[1]
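Affiliations: [1] Institute of Forensic Science, Ministry of Public Security, Beijing 100038; [2] Department of Electronic Science and Technology, University of Science and Technology of China, Hefei, Anhui 230027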
Source: Computer Applications and Software (《计算机应用与软件》), 2016, No. 12, pp. 131-135 (5 pages)
Funding: Beijing Municipal Science and Technology Commission project (Z141100006014002)
Abstract: To address speaker modelling in speaker verification, this paper proposes a hybrid GMM-DNN modelling method. The method first uses a GMM to extract statistical features from the original speech features, then applies the nonlinear mapping of a DNN to transform those statistical features into a linearly separable space correlated with the speaker. A stacked auto-encoder neural network (SAE) is chosen as the basic model of the deep neural network. At the registration stage, the last hidden layer of the trained DNN is extracted as the speaker model, called the p-vector. At the test stage, the p-vector extracted from the test speech is matched against the p-vector of the registered speaker to make the verification decision. The paper also explains the role of the DNN hidden layers in detail. Experiments on the NIST corpus show that the GMM-DNN speaker verification method has certain advantages over the conventional GMM-UBM speaker modelling method.
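The pipeline described in the abstract (GMM statistics, DNN mapping, p-vector matching) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the record gives no formulas, so the choice of normalized first-order statistics, the sigmoid activations, the cosine score, and the decision threshold are all assumptions, and the function names are hypothetical. In practice the layer weights would come from a trained SAE rather than being supplied by hand.

```python
import numpy as np

def gmm_first_order_stats(frames, means, variances, weights):
    """Assumed statistic: posterior-weighted, centred first-order stats
    of a diagonal-covariance GMM, flattened into one vector."""
    diff = frames[:, None, :] - means[None, :, :]            # (T, K, D)
    log_prob = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                       + np.sum(np.log(2 * np.pi * variances), axis=1))
    log_post = np.log(weights) + log_prob                    # (T, K)
    log_post -= log_post.max(axis=1, keepdims=True)          # numerical stability
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)                  # component posteriors
    n_k = post.sum(axis=0)                                   # zeroth-order stats (K,)
    f_k = np.einsum("tk,tkd->kd", post, diff)                # first-order stats (K, D)
    return (f_k / (n_k[:, None] + 1e-8)).ravel()

def p_vector(stats, layers):
    """Forward the statistics through the hidden layers and return the
    activation of the last one (sigmoid units are an assumption)."""
    h = stats
    for W, b in layers:
        h = 1.0 / (1.0 + np.exp(-(h @ W + b)))
    return h

def cosine_score(a, b):
    """Assumed matching rule: cosine similarity between two p-vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def verify(registered, test, threshold=0.5):
    """Accept the claimed identity if the score reaches the (assumed) threshold."""
    return cosine_score(registered, test) >= threshold
```

A registered speaker would be represented by `p_vector(gmm_first_order_stats(...), layers)` computed from the registration speech, and a trial would call `verify` with the p-vector of the test utterance.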
Keywords: speaker recognition; deep neural network; Gaussian mixture model; statistical parameters
Classification: TP3 [Automation and Computer Technology: Computer Science and Technology]