Robust speaker recognition based on improved PNCC and i-vector

Cited by: 3


Authors: Shi Xiaoyuan (史小元), Jing Xinxing (景新幸)[1], Zeng Min (曾敏)[1], Yang Haiyan (杨海燕)[1]

Affiliation: [1] School of Information and Communication, Guilin University of Electronic Technology, Guilin 541004, Guangxi, China

Source: Computer Engineering and Design (《计算机工程与设计》), 2017, No. 4, pp. 1071-1075 (5 pages)

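Funding: Guangxi Natural Science Foundation (2012GXNSFAA053221); Guangxi Hundred-Billion-Yuan Industry University-Industry Research and Application Cooperation Fund (信科院0168)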

Abstract: To address the insufficient robustness of traditional Mel-frequency cepstral coefficients (MFCC) in speaker recognition systems, a method is proposed that combines an improved power-normalized cepstral coefficients (PNCC) feature algorithm with an identity vector (i-vector) model. Unlike traditional MFCC, PNCC uses long-term frames to estimate background noise. On this basis, multi-window spectral estimation, smoothing of the amplitude spectral envelope, and mean and variance normalization (MVA) are applied to further improve robustness. With i-vector as the baseline model, speaker recognition experiments were carried out on the TIMIT speech database. The results show that, across different noise types and signal-to-noise ratios (SNR), the proposed algorithm achieves the lowest equal error rate and the strongest robustness among the compared features, with a larger advantage in noisy environments where the SNR is below 10 dB.
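
The abstract names mean and variance normalization and the equal error rate without giving formulas. The following is a minimal, dependency-free sketch of both ideas, not the authors' implementation. It assumes per-utterance statistics for the normalization and assumes that higher scores mean "same speaker"; the function names `mean_variance_normalize` and `equal_error_rate` are hypothetical, and the paper's MVA step may include processing beyond what is shown here.

```python
from statistics import fmean, pstdev


def mean_variance_normalize(frames, eps=1e-8):
    """Scale each cepstral dimension to zero mean and unit variance.

    `frames` is a list of feature vectors (one per frame) from a single
    utterance. Using per-utterance statistics is an assumption of this sketch.
    """
    dims = list(zip(*frames))  # transpose: one tuple per coefficient index
    means = [fmean(d) for d in dims]
    stds = [pstdev(d) for d in dims]
    return [
        [(x - m) / (s + eps) for x, m, s in zip(frame, means, stds)]
        for frame in frames
    ]


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Higher scores are assumed to indicate the same speaker. Returns the mean
    of the false-rejection and false-acceptance rates at the threshold where
    the two rates are closest.
    """
    best_gap, eer = float("inf"), 1.0
    for thr in sorted(set(target_scores) | set(nontarget_scores)):
        frr = sum(s < thr for s in target_scores) / len(target_scores)
        far = sum(s >= thr for s in nontarget_scores) / len(nontarget_scores)
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer
```

In this sketch a lower value from `equal_error_rate` indicates better verification performance, which is the sense in which the abstract compares features across noise types and SNR levels.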

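Keywords: power-normalized cepstral coefficients (PNCC); identity vector (i-vector); mean and variance normalization (MVA); multi-window spectral estimation; robustness; speaker recognition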

Classification: TP391.4 [Automation and Computer Technology: Computer Application Technology]

 
