Improvement of Phoneme Acoustic Modeling in Large Vocabulary Continuous Mandarin Speech Recognition System (汉语连续语音识别之音素声学模型的改进)  Cited by: 7


Authors: 吕丹桔 (Lü Danju) [1], Mei-Yuh Huang, B. Hoffmeister

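Affiliations: [1] Department of Computer and Information Science, Southwest Forestry College, Kunming, Yunnan 650224, China; [2] Microsoft Research Asia, Redmond, Washington 98052, USA; [3] Computer Science Department 6, RWTH Aachen University, Aachen 52056, Germany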

Source: Computer Simulation (《计算机仿真》), 2010, No. 5, pp. 355-358 (4 pages)

Abstract: This paper studies improvements to acoustic models built on main-vowel phoneme units. Because of the phonetic characteristics of Mandarin, the main-vowel model is widely used. Analysis of the main-vowel phoneme model shows that character boundaries within the syllable sequence of a multi-character word are ambiguous, that is, the decomposition of a word's pronunciation into a sequence of syllables is not unique. An improved main-vowel scheme is therefore proposed that makes the character boundaries in a syllable sequence explicit, reduces the size of the unit inventory, and raises recognition accuracy. To capture co-articulation in continuous speech, a corresponding tonal question set is designed for the improved main-vowel units, and context-dependent triphone models are built with decision-tree-based state tying for parameter sharing. Experiments show that the improved tonal phoneme set contains fewer units than the original and yields an absolute reduction in character error rate (CER) of about 0.4% to 0.6%.
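The boundary ambiguity mentioned in the abstract can be illustrated with a minimal sketch. The three-syllable lexicon and the letter-like units below are hypothetical stand-ins chosen for illustration; they are not the phone set or the method used in the paper. The sketch only enumerates every way a unit sequence without boundary information can be split into known syllables.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon: syllable -> unit sequence.
# This is an illustrative assumption, not the paper's main-vowel phone set.
SYLLABLES: Dict[str, Tuple[str, ...]] = {
    "xi": ("x", "i"),
    "an": ("a", "n"),
    "xian": ("x", "i", "a", "n"),
}


def segmentations(units: Tuple[str, ...]) -> List[List[str]]:
    """Return every way to split `units` into syllables from SYLLABLES."""
    if not units:
        return [[]]  # one valid segmentation of nothing: the empty one
    results: List[List[str]] = []
    for syllable, phones in SYLLABLES.items():
        # Try each syllable whose unit sequence is a prefix of the input.
        if units[:len(phones)] == phones:
            for rest in segmentations(units[len(phones):]):
                results.append([syllable] + rest)
    return results


# The same unit sequence can be read as one syllable or as two.
ambiguous = segmentations(("x", "i", "a", "n"))
print(ambiguous)
```

In this toy setting the sequence `x i a n` admits both the one-syllable reading `xian` and the two-syllable reading `xi` + `an`, which is the kind of non-unique decomposition the abstract describes.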

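Keywords: large vocabulary continuous Mandarin speech recognition; phoneme; main vowel; decision tree

Classification code: TP912 [Automation and Computer Technology]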

 
