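Authors: LUO Chunmei; ZHANG Fenglei (School of Chemical and Mechanical Engineering, Eastern Liaoning University, Dandong 118000, Liaoning, China)
Affiliation: [1] School of Chemical and Mechanical Engineering, Eastern Liaoning University, Dandong 118000, Liaoning, China
Source: Technical Acoustics (《声学技术》), 2021, No. 4, pp. 503-507 (5 pages)
Funding: Scientific Research Project of the Education Department of Liaoning Province (LNSJYT201904).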
Abstract: To improve the performance of neural networks in speaker recognition, a speaker recognition algorithm based on Gaussian value-added matrix features and an improved deep convolutional neural network is proposed. The algorithm first uses maximum a posteriori (MAP) estimation to extract an adaptive Gaussian mean matrix from Mel frequency cepstrum coefficient (MFCC) features and applies noise-adaptive compensation to the features, which strengthens inter-frame correlation and speaker-specific information. An improved deep convolutional neural network then further aligns the inter-frame information, improving feature learning for speaker recognition and adaptability to background noise. Experimental results show that, compared with recognition frameworks such as the Gaussian mixture model-universal background model (GMM-UBM) and with traditional features such as MFCC, the proposed algorithm achieves higher recognition accuracy and the lowest recognition mean square error.
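Keywords: speaker recognition; Mel frequency cepstrum coefficient (MFCC); deep convolutional neural network; Gaussian mean matrix
Classification: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]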
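The abstract's first step, extracting a Gaussian mean matrix from MFCC features by maximum a posteriori estimation, can be illustrated with the classical relevance-MAP mean adaptation used in GMM-UBM systems. The sketch below is only an assumption about what such a step might look like; the record does not give the paper's actual formulation, noise compensation, or network design. The function names, the diagonal-covariance model, and the relevance factor of 16 are all hypothetical choices made for illustration.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Relevance-MAP adaptation of GMM means (illustrative, not the paper's code).

    frames    : list of feature vectors (e.g. MFCC frames of one utterance)
    weights   : mixture weights of the background model
    means     : background means, one row per mixture component
    variances : diagonal variances, one row per mixture component
    relevance : assumed relevance factor controlling how far means may move
    Returns the adapted mean matrix, one row per component.
    """
    n_comp, dim = len(means), len(means[0])
    occupancy = [0.0] * n_comp                         # soft counts per component
    first_order = [[0.0] * dim for _ in range(n_comp)]  # posterior-weighted sums

    for x in frames:
        log_post = [
            math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)
        ]
        top = max(log_post)                            # subtract max for stability
        post = [math.exp(lp - top) for lp in log_post]
        norm = sum(post)
        for c in range(n_comp):
            gamma = post[c] / norm                     # component posterior
            occupancy[c] += gamma
            for d in range(dim):
                first_order[c][d] += gamma * x[d]

    # Adapted mean = (sum_t gamma * x_t + r * prior_mean) / (n_c + r)
    return [
        [
            (first_order[c][d] + relevance * means[c][d])
            / (occupancy[c] + relevance)
            for d in range(dim)
        ]
        for c in range(n_comp)
    ]
```

Stacking the returned rows gives one fixed-size matrix per utterance regardless of utterance length, which is the kind of input a convolutional network could consume. Components that see little data stay close to the background means, because the relevance factor dominates the denominator.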