Improved Posterior Probability Estimation Methods for Freely-Spoken Speech Evaluation (自由表述口语语音评测后验概率估计改进方法)  Cited by: 5


Authors: Xu Sukui (许苏魁), Dai Lirong (戴礼荣)[1], Wei Si (魏思), Liu Qingfeng (刘庆峰)[1,2], Gao Qianyong (高前勇)

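Affiliations: [1] National Engineering Laboratory for Speech and Language Information Processing, University of Science and Technology of China, Hefei, Anhui 230027; [2] iFLYTEK Co., Ltd. (科大讯飞信息股份有限公司), Hefei, Anhui 230088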

Source: Journal of Chinese Information Processing (《中文信息学报》), 2017, No. 2, pp. 212-219 (8 pages)

Funding: National Natural Science Foundation of China (61273264)

Abstract: This paper studies two methods for improving posterior probability estimation in the evaluation of freely-spoken speech under the deep neural network (DNN) acoustic modeling framework: 1) an RNN language model re-scores the N-best candidates produced by first-pass decoding, yielding more accurate recognition results from which the posterior probability is re-estimated; 2) borrowing from the multilingual neural network training framework, clustered states from dialect data are added to the output nodes of the decoding neural network, so that a dialect likelihood score enters the posterior probability estimation and reflects the degree of dialect influence. Experimental results show that the two methods raise the correlation between the estimated posterior probabilities and human scores by 3.5% and 1.0% absolute, respectively, and that combining them yields a 4.9% absolute improvement. On a real evaluation task, adding the improved posterior probability scoring feature raises the overall correlation between machine scores and human scores by 2.2% absolute.
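The first method in the abstract, re-scoring first-pass N-best candidates with an RNN language model, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Hypothesis` fields, the log-linear interpolation between RNN and n-gram scores, the values of `lm_scale` and `interp`, and the `toy_rnn_lm` stand-in are all assumptions made for illustration, since this record does not give the paper's formulas.

```python
from typing import Callable, List, NamedTuple, Sequence


class Hypothesis(NamedTuple):
    """One first-pass N-best candidate (field names are hypothetical)."""
    words: List[str]
    acoustic_logprob: float  # acoustic log-likelihood from first-pass decoding
    ngram_logprob: float     # first-pass n-gram LM log-probability


def rescore_nbest(
    nbest: Sequence[Hypothesis],
    rnn_lm_logprob: Callable[[List[str]], float],
    lm_scale: float = 10.0,  # assumed LM weight, normally tuned on held-out data
    interp: float = 0.5,     # assumed RNN vs. n-gram interpolation weight
) -> List[Hypothesis]:
    """Return the candidates sorted best-first by the re-scored total."""
    def total(h: Hypothesis) -> float:
        # Log-linear interpolation of the two LM scores (an assumption).
        lm = interp * rnn_lm_logprob(h.words) + (1.0 - interp) * h.ngram_logprob
        return h.acoustic_logprob + lm_scale * lm
    return sorted(nbest, key=total, reverse=True)


def toy_rnn_lm(words: List[str]) -> float:
    # Hypothetical stand-in for a trained RNN LM: a fixed per-word table.
    table = {"speech": -1.0, "peach": -4.0}
    return sum(table.get(w, -2.0) for w in words)


nbest = [
    Hypothesis(["recognize", "peach"], -100.0, -3.0),   # first-pass 1-best
    Hypothesis(["recognize", "speech"], -101.0, -3.2),
]
best = rescore_nbest(nbest, toy_rnn_lm)[0]
print(" ".join(best.words))  # prints "recognize speech"
```

In this toy case the first-pass 1-best is overturned because the stand-in language model strongly prefers the second word sequence. In an evaluation pipeline, the re-ranked top hypothesis would then serve as the reference transcript for re-estimating the posterior probability.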

Keywords: freely-spoken speech; speech evaluation; posterior probability; deep neural network; RNN language model

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

 
