Tonal articulatory feature-based acoustic modeling for Chinese Putonghua speech recognition

Original title: 基于发音特征的汉语普通话语音声学建模 (cited by: 14)

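Affiliation: [1] ThinkIT Speech Lab, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190

Source: Acta Acustica (《声学学报》), 2010, No. 2, pp. 254-260 (7 pages)

Funding: National Key Technology R&D Program (2008BAI50B00); National Natural Science Foundation of China (10925419; 90920302; 10874203; 60875014)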

Abstract: Articulatory features (AF) that characterize the pronunciation of Chinese Putonghua are introduced into acoustic modeling for Putonghua speech recognition. Based on the characteristics of Putonghua pronunciation, a set of nine articulatory features is defined to distinguish vowels, consonants, and tones. Neural networks are trained with these features as targets to estimate the posterior probability that a speech signal belongs to each articulatory feature class, and these posteriors are used as the input (tandem) features for building the acoustic model. The approach is evaluated on a speaker-independent conversational Large Vocabulary Continuous Speech Recognition (LVCSR) system for spontaneous Putonghua and compared with an acoustic model built on spectral features. At the same decoding speed, the tonal AF-based acoustic model achieves a 6.8% relative reduction in Character Error Rate (CER). When articulatory features are combined with spectral features at the feature level and the word level, the combined system achieves a 10.1% relative CER reduction over the spectral-feature system. These results indicate that the AF-based acoustic model represents the characteristics of speech more effectively, and that exploiting the complementary information in articulatory and spectral features improves recognition performance further.
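The feature-level combination described in the abstract can be sketched roughly as follows. This is a minimal illustration rather than the authors' implementation: the abstract does not say how the posteriors are post-processed, so the log transform and PCA step below follow the common tandem recipe and are assumptions, as are the function names (`softmax`, `tandem_features`), the class counts per articulatory feature, and all dimensions. Random numbers stand in for the neural network outputs.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Turn classifier outputs into posteriors that sum to 1 along `axis`."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def tandem_features(spectral, af_posteriors, n_components, eps=1e-10):
    """Append decorrelated log-posteriors to per-frame spectral features.

    spectral:      (T, D) array of spectral features, one row per frame.
    af_posteriors: list of (T, K_i) arrays, one per articulatory feature
                   classifier, each row a posterior distribution.
    n_components:  number of PCA dimensions kept (assumed post-processing).
    """
    # Stack all classifier posteriors and take logs to reduce skewness.
    log_post = np.log(np.concatenate(af_posteriors, axis=1) + eps)
    # PCA via SVD of the mean-centered log-posteriors.
    centered = log_post - log_post.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    reduced = centered @ vt[:n_components].T
    # Feature-level fusion: concatenate along the feature axis.
    return np.concatenate([spectral, reduced], axis=1)


# Toy data: 50 frames, 39-dim spectral features, nine AF classifiers
# with hypothetical class counts (not taken from the paper).
rng = np.random.default_rng(0)
n_frames, class_counts = 50, (5, 4, 6, 3, 4, 3, 5, 4, 6)
spectral = rng.normal(size=(n_frames, 39))
posteriors = [softmax(rng.normal(size=(n_frames, k))) for k in class_counts]
fused = tandem_features(spectral, posteriors, n_components=20)
print(fused.shape)
```

In a real system the fused vectors would then be used to train the acoustic model in place of the spectral features alone; the word-level combination mentioned in the abstract happens after decoding and is not covered by this sketch.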

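Keywords: Chinese Putonghua; speech recognition; input features; acoustic modeling; articulation; acoustic model; spectral features; posterior probability

Classification: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]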
