Statistical-Model-Based Dynamic Channel Compensation in Telephone Speech Recognition

Dynamic Channel Compensation Based on Statistical Model for Mandarin Speech Recognition over Telephone

Affiliation: [1] National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100080

Source: Journal of Electronics & Information Technology (《电子与信息学报》), 2004, No. 11, pp. 1714-1720 (7 pages)

Funding: National Natural Science Foundation of China (69835003)

Abstract: Speech recognition accuracy over the telephone network remains considerably lower than in desktop environments, and raising it is a prerequisite for practical telephone-based applications. Previous work has shown that the degradation stems mainly from the data mismatch caused by different telephone channels in the training and test conditions. This paper proposes a dynamic channel compensation algorithm based on a statistical model (SMDC) to reduce that mismatch. The algorithm builds on a phone-conditioned prior statistical model of the channel bias, estimates the telephone channel with Bayesian estimation, and dynamically tracks the channel's time-varying characteristics. In experiments on Mandarin Large Vocabulary Continuous Speech Recognition (LVCSR) over telephone lines, the Character Error Rate (CER) falls by about 27% relative; in isolated-word (short-utterance) tests, the Word Error Rate (WER) falls by about 30% relative. The algorithm's structural delay and computational cost are also small, with an average delay of about 200 ms, so it can be embedded readily in practical telephone speech recognition applications.
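The abstract gives no formulas, so the following is only a minimal sketch of the general idea of Bayesian (MAP) tracking of a channel bias, not the SMDC algorithm itself. It assumes the channel acts as an additive bias on cepstral features, that the bias has a single Gaussian prior (the paper's prior is phone-conditioned, which is not modeled here), and that each frame comes with the mean and variance of its aligned acoustic-model Gaussian. The names `ChannelBiasTracker`, `forgetting`, and `compensate` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ChannelBiasTracker:
    """Recursive MAP estimate of an additive channel bias (illustrative only)."""

    prior_mean: List[float]   # assumed Gaussian prior mean of the bias, per dimension
    prior_var: List[float]    # assumed Gaussian prior variance of the bias, per dimension
    forgetting: float = 0.98  # assumed decay factor so that older frames count less

    def __post_init__(self) -> None:
        dim = len(self.prior_mean)
        self._weighted_sum = [0.0] * dim  # decayed sum of (y - m) / s^2
        self._precision = [0.0] * dim     # decayed sum of 1 / s^2

    def update(
        self,
        observed: Sequence[float],
        model_mean: Sequence[float],
        model_var: Sequence[float],
    ) -> List[float]:
        """Fold in one frame and return the current MAP bias estimate."""
        estimate = []
        for d, (y, m, s2) in enumerate(zip(observed, model_mean, model_var)):
            # Decay the accumulated evidence, then add this frame's contribution.
            self._weighted_sum[d] = self.forgetting * self._weighted_sum[d] + (y - m) / s2
            self._precision[d] = self.forgetting * self._precision[d] + 1.0 / s2
            # Gaussian prior times Gaussian likelihood: precision-weighted mean.
            numerator = self.prior_mean[d] / self.prior_var[d] + self._weighted_sum[d]
            denominator = 1.0 / self.prior_var[d] + self._precision[d]
            estimate.append(numerator / denominator)
        return estimate


def compensate(observed: Sequence[float], bias: Sequence[float]) -> List[float]:
    """Subtract the estimated bias from an observed feature vector."""
    return [y - b for y, b in zip(observed, bias)]
```

With few frames the estimate stays close to the prior mean; as evidence accumulates it moves toward the observed offset, and a `forgetting` value below 1 lets it follow a channel that drifts over time.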

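Keywords: telephone speech recognition; dynamic channel compensation; maximum likelihood estimation; maximum a posteriori estimation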

Classification number: TP391.42 [Automation and Computer Technology: Computer Application Technology]
