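Authors: Han Zhaobing (韩兆兵)[1], Zhang Huayun (张化云)[1], Zhang Shuwu (张树武)[1], Xu Bo (徐波)[1]
Affiliation: [1] National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100080
Source: Journal of Electronics & Information Technology (《电子与信息学报》), 2004, No. 11, pp. 1714-1720 (7 pages)
Funding: National Natural Science Foundation of China (69835003)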
Abstract: Speech recognition over the telephone network is still markedly less accurate than in desktop environments, and improving it is a prerequisite for practical telephone-based applications. Previous work has shown that the degradation is caused mainly by the data mismatch that arises when the telephone channels of the test and training sets differ. This paper proposes a statistical-model-based dynamic channel compensation algorithm (SMDC) that reduces this mismatch, using a phone-conditioned prior statistical model of the channel bias. The algorithm applies Bayesian estimation to estimate the telephone channel and to track its time variation dynamically. In experiments on Mandarin large vocabulary continuous speech recognition (LVCSR) over telephone lines, the character error rate (CER) falls by about 27% relative; for isolated words (short utterances), the word error rate (WER) falls by about 30% relative. The algorithm's structural delay and computational cost are also small, with an average delay of about 200 ms, so it can be embedded in practical telephone speech recognition applications.
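The Bayesian tracking of a channel bias described in the abstract can be illustrated with a small sketch. This is not the paper's SMDC algorithm, whose details are not given in this record. It assumes a much simpler setting: a single scalar feature, an additive channel bias with a Gaussian prior, Gaussian residual noise, and a random-walk drift term to follow time variation. The function names `map_bias_update` and `track_bias` and all parameters are hypothetical.

```python
def map_bias_update(residuals, prior_mean, prior_var, noise_var):
    """MAP estimate of an additive bias under a Gaussian prior (assumed model).

    residuals: observed feature minus the acoustic-model mean, per frame.
    Returns the posterior mean and posterior variance of the bias.
    """
    n = len(residuals)
    precision = 1.0 / prior_var + n / noise_var
    weighted = prior_mean / prior_var + sum(residuals) / noise_var
    return weighted / precision, 1.0 / precision


def track_bias(observed, model_means, prior_mean=0.0, prior_var=1.0,
               noise_var=1.0, drift_var=0.01):
    """Frame-by-frame tracking: each posterior becomes the next prior.

    drift_var inflates the prior variance before every update so the
    estimate can follow a slowly changing channel (an assumption of this
    sketch, not a detail taken from the paper).
    """
    mean, var = prior_mean, prior_var
    compensated = []
    for y, mu in zip(observed, model_means):
        var += drift_var  # allow the bias to drift between frames
        mean, var = map_bias_update([y - mu], mean, var, noise_var)
        compensated.append(y - mean)  # remove the estimated bias
    return mean, compensated


if __name__ == "__main__":
    # Toy data: model means of 1.0 observed through a constant bias of 0.5.
    means = [1.0] * 200
    obs = [m + 0.5 for m in means]
    bias, cleaned = track_bias(obs, means)
    print(bias, cleaned[-1])
```

With few frames the estimate stays close to the prior mean, and as more frames arrive it moves toward the average residual, which is the behaviour that makes a prior useful for short utterances.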
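Keywords: telephone speech recognition; dynamic channel compensation; maximum likelihood estimation; maximum a posteriori estimation
Classification: TP391.42 [Automation and Computer Technology: Computer Application Technology]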