Speaker Identification with Chinese Whispered Speech Based on Instantaneous Frequency Estimation and Feature Mapping

Cited by: 5


Authors: Wang Min (王敏)[1], Zhao Heming (赵鹤鸣)[1], Zhang Qingfang (张庆芳)[1]

Affiliation: [1] School of Electronics and Information Engineering, Soochow University, Suzhou 215006, China

Source: Journal of Data Acquisition and Processing (《数据采集与处理》), 2011, No. 6, pp. 686-690 (5 pages)

Funding: Supported by the National Natural Science Foundation of China (60572076; 61071215)

Abstract: Whispered speech is a weak speech signal produced in a mode that differs from neutral (normally phonated) speech. When a speaker identification system (SIS) trained on neutral speech is tested with whispered speech, its performance declines sharply. Based on the amplitude modulation-frequency modulation (AM-FM) model of speech production, this paper uses multi-band demodulation analysis (MDA) and the energy separation algorithm (ESA) to compute the instantaneous frequency of the speech signal, and uses the resulting instantaneous frequency estimate (IFE) as a speech feature. Then, under the assumption that whispered speech and neutral speech come from different channels, feature mapping is applied to the speech parameters before training and identification to reduce channel effects on the system. Experiments against conventional MFCC parameters show that adding feature mapping improves the identification rate of the system, and that IFE is better than MFCC in both identification rate and robustness.
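The abstract names the energy separation algorithm (ESA) but gives no formulas. As a rough illustration only, the sketch below uses DESA-2, one common discrete form of ESA built on the Teager-Kaiser energy operator, to estimate the instantaneous frequency of a single narrowband signal. It is not taken from the paper: the function names `teager_energy` and `desa2_frequency`, the `eps` guard, and the 1 kHz test tone are assumptions, and the band-pass filterbank that MDA would apply before this step is omitted because the abstract does not describe it.

```python
import numpy as np


def teager_energy(x):
    """Discrete Teager-Kaiser energy: psi[x](n) = x(n)^2 - x(n-1) * x(n+1)."""
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]


def desa2_frequency(x, fs, eps=1e-12):
    """Estimate instantaneous frequency (Hz) of a narrowband signal with DESA-2.

    Returns len(x) - 4 values, for samples n = 2 .. len(x) - 3.
    The estimate is only valid for frequencies below fs / 4.
    """
    x = np.asarray(x, dtype=float)
    y = x[2:] - x[:-2]                # y(n) = x(n+1) - x(n-1), n = 1 .. N-2
    psi_x = teager_energy(x)[1:-1]    # trimmed to n = 2 .. N-3
    psi_y = teager_energy(y)          # defined for n = 2 .. N-3
    cos_arg = np.full_like(psi_x, np.nan)
    ok = np.abs(psi_x) > eps          # skip samples with near-zero energy
    cos_arg[ok] = 1.0 - psi_y[ok] / (2.0 * psi_x[ok])
    omega = 0.5 * np.arccos(np.clip(cos_arg, -1.0, 1.0))  # radians per sample
    return omega * fs / (2.0 * np.pi)


# Hypothetical check signal: a 1 kHz cosine sampled at 16 kHz.
fs = 16000
n = np.arange(400)
tone = 0.8 * np.cos(2.0 * np.pi * 1000.0 * n / fs + 0.3)
f_hz = desa2_frequency(tone, fs)
print(float(np.median(f_hz)))
```

For a pure cosine the estimate should sit very close to the true 1000 Hz at every sample. On real speech the operator is applied per band after band-pass filtering, since it assumes a single AM-FM component.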

Keywords: whispered-speech speaker identification; AM-FM model; instantaneous frequency estimation; feature mapping

Classification code: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]

 
