Whispered speaker identification based on multiband demodulation analysis and instantaneous frequency estimation  Cited by: 12

Authors: Wang Min [1], Zhao Heming [1]

Affiliation: [1] School of Electronics and Information Engineering, Soochow University, Suzhou 215006, China

Source: Acta Acustica (声学学报), 2010, No. 4, pp. 471-476 (6 pages)

Funding: National Natural Science Foundation of China (60572076)

Abstract: To improve the robustness of whispered speaker identification, a whispered-speech feature called instantaneous frequency estimation (IFE) is proposed, based on the AM-FM (amplitude modulation, frequency modulation) model of the speech signal. Following the formant modulation theory of speech production, multiband demodulation analysis (MDA) is used to extract the instantaneous envelope and instantaneous frequency of the speech. The IFE feature is then obtained as a weighted estimate over envelope amplitude and frequency, and describes the frequency structure of the speech. The feature is applied to whispered speaker identification and compared with conventional Mel-frequency cepstral coefficients (MFCC). Experimental results show that, as the number of test speakers increases, IFE performs slightly better than MFCC; when the test channel changes, IFE is considerably more robust than MFCC.
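The pipeline described in the abstract (band-pass filtering, demodulation into envelope and instantaneous frequency, then a weighted frequency estimate per band) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the record does not specify the filterbank, the demodulation algorithm, or the weighting. The sketch assumes complex Gabor filters whose output approximates the analytic signal of each band, and a squared-envelope weighting of the instantaneous frequency within each frame. All function names and parameter values (`demodulate_band`, `ife_features`, band centres, bandwidth, frame sizes) are hypothetical.

```python
import numpy as np


def demodulate_band(x, fc, bw, fs):
    """Return (envelope, instantaneous frequency in Hz) for one band.

    Assumption: a complex Gabor filter centred at fc stands in for the
    band-pass filter plus demodulation step; the paper may use another method.
    """
    sigma_t = 1.0 / (np.pi * bw)            # time-domain std for sigma_f = bw / 2
    half = int(np.ceil(4 * sigma_t * fs))   # truncate the Gaussian at 4 sigma
    t = np.arange(-half, half + 1) / fs
    g = np.exp(-0.5 * (t / sigma_t) ** 2)
    # Factor 2 restores the amplitude of a real sinusoid at the band centre.
    h = 2.0 * g / g.sum() * np.exp(2j * np.pi * fc * t)
    z = np.convolve(x, h, mode="same")      # approximately analytic band signal
    env = np.abs(z)
    # Phase increment between consecutive samples, converted to Hz.
    inst_f = np.angle(z[1:] * np.conj(z[:-1])) * fs / (2 * np.pi)
    return env[1:], inst_f


def ife_features(x, fs, centers, bw, frame_len, hop):
    """Envelope-weighted mean instantaneous frequency per frame and band."""
    feats = []
    for fc in centers:
        env, inst_f = demodulate_band(x, fc, bw, fs)
        w = env ** 2                        # assumed weighting: squared envelope
        band = []
        for start in range(0, len(w) - frame_len + 1, hop):
            sl = slice(start, start + frame_len)
            band.append(np.sum(w[sl] * inst_f[sl]) / (np.sum(w[sl]) + 1e-12))
        feats.append(band)
    return np.array(feats).T                # shape: (frames, bands)


# Synthetic check signal: two steady tones, one per analysis band.
fs = 8000
t = np.arange(1600) / fs
x = np.cos(2 * np.pi * 1000 * t) + 0.5 * np.cos(2 * np.pi * 2000 * t)
feats = ife_features(x, fs, centers=[1000, 2000], bw=400, frame_len=200, hop=100)
```

On this synthetic input each band's estimate should sit close to the tone it contains. For real whispered speech the band centres would normally follow a perceptual or formant-oriented layout, and the resulting frame-by-band matrix would be passed to a speaker classifier in place of MFCC vectors; both choices are left open here because the record does not describe them.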

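Keywords: instantaneous frequency estimation; speaker identification; demodulation analysis; whispered speech; speech feature parameters; Mel cepstral coefficients (MFCC); modulation theory; weighted estimation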

Classification No.: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]

 
