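Affiliation: [1] School of Information and Communication, Guilin University of Electronic Technology, Guilin, Guangxi 541004, China
Source: Computer Engineering and Design (《计算机工程与设计》), 2017, No. 4, pp. 1071-1075 (5 pages)
Funding: Natural Science Foundation of Guangxi (2012GXNSFAA053221); Guangxi Hundred-Billion-Yuan Industry Industry-University-Research-Application Cooperation Fund (信科院0168)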
Abstract: Traditional Mel-frequency cepstral coefficients (MFCC) lack robustness in speaker recognition systems. To address this, a method is proposed that combines an improved power-normalized cepstral coefficients (PNCC) feature algorithm with an identity vector (i-vector) model. Unlike traditional MFCC, PNCC estimates background noise over long-term frames. On this basis, multi-window spectral estimation, smoothing of the amplitude spectral envelope, and mean and variance normalization (MVA) are applied to further improve robustness. With i-vector as the baseline model, speaker recognition experiments were carried out on the TIMIT speech corpus. The results show that, across different noise types and signal-to-noise ratios (SNR), the proposed algorithm achieves the lowest equal error rate and the strongest robustness among the compared features, with a larger advantage in noisy environments where the SNR is below 10 dB.
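Keywords: power-normalized cepstral coefficients; identity vector; mean and variance normalization; multi-window spectral estimation; robustness; speaker recognition
Classification: TP391.4 [Automation and Computer Technology: Computer Application Technology]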