Affiliation: [1] Institute of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065
Source: Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), 2014, No. 1, pp. 117-123, 130 (8 pages)
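Funding: Natural Science Foundation of Chongqing (CSTC 2007BB2445); Science and Technology Research Project of the Chongqing Municipal Education Commission (KJ110522); Research Fund of Chongqing University of Posts and Telecommunications (A2009-26)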
Abstract: In an enclosed environment, distant-talking speech recognition is degraded by reverberation, which lowers the recognition rate. Reverberation modeling for speech recognition (REMOS) is a recent method that compensates for reverberation in the model domain; it effectively improves distant-talking recognition accuracy when the position of the sound source is known. In practice, however, the source position is often hard to predict. Building on frame-wise hidden Markov model (HMM) compensation, and using the maximum a posteriori principle together with the idea of compensating different regions of a room differently, this paper proposes a new model compensation method for enclosed environments. The method clusters an optimized set of room impulse responses (RIRs) with the K-means algorithm, merges the reverberation models that fall into the same cluster, and loads the merged models into the Viterbi algorithm to compensate the clean-speech HMMs frame by frame. Finally, the best compensation is selected by posterior probability, so that the model-domain reverberation compensation comes as close as possible to the exact compensation. Experiments show that the method further improves distant-talking speech recognition accuracy.
Keywords: reverberation; reverberation modeling (REMOS); K-means; room impulse response; model compensation
Classification: TP391.4 [Automation and Computer Technology: Computer Application Technology]
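
The cluster, merge, and select steps named in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the record gives no details of the REMOS reverberation model or of the frame-wise HMM compensation inside Viterbi decoding, so this example assumes each RIR is summarized by a hypothetical fixed-length feature vector, merges each cluster into a single diagonal Gaussian, and replaces the decoding step with a plain Gaussian log-likelihood when scoring the posterior. All function names, the feature representation, and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def kmeans(X, k, n_iter=50):
    """Lloyd's K-means with farthest-point initialization (deterministic)."""
    centroids = [X[0]]
    for _ in range(1, k):
        # Distance from every point to its nearest already-chosen centroid.
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centroids], axis=0)
        centroids.append(X[d.argmax()])
    centroids = np.array(centroids, dtype=float)
    for _ in range(n_iter):
        d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        new = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        converged = np.allclose(new, centroids)
        centroids = new
        if converged:
            break
    return centroids, labels

def merge_clusters(X, labels, k, var_floor=1e-3):
    """Merge the members of each cluster into one diagonal-Gaussian model.

    Assumption: a 'reverberation model' is reduced here to a mean, a variance,
    and a prior equal to the cluster's share of the RIR set.
    """
    models = []
    for j in range(k):
        members = X[labels == j]
        if len(members) == 0:
            continue  # skip empty clusters
        models.append({
            "mean": members.mean(axis=0),
            "var": members.var(axis=0) + var_floor,
            "prior": len(members) / len(X),
        })
    return models

def select_map(models, frames):
    """Pick the merged model with the highest log posterior for the frames.

    Assumption: a Gaussian log-likelihood stands in for the score that the
    paper obtains from Viterbi decoding with compensated HMMs.
    """
    scores = []
    for m in models:
        ll = -0.5 * (np.log(2 * np.pi * m["var"])
                     + (frames - m["mean"]) ** 2 / m["var"]).sum()
        scores.append(np.log(m["prior"]) + ll)
    return int(np.argmax(scores)), scores

# Synthetic stand-in for RIR features measured in two regions of a room.
rng = np.random.default_rng(1)
near = rng.normal(0.0, 0.3, size=(20, 4))
far = rng.normal(5.0, 0.3, size=(20, 4))
rir_features = np.vstack([near, far])

centroids, labels = kmeans(rir_features, k=2)
models = merge_clusters(rir_features, labels, k=2)

# Frames observed from an unknown source position (drawn from the 'far' region).
observed = rng.normal(5.0, 0.3, size=(10, 4))
best, scores = select_map(models, observed)
print("selected merged model mean:", models[best]["mean"].round(2))
```

Merging per cluster keeps the number of candidate compensations equal to the number of room regions rather than the number of measured RIRs, which is the practical point of clustering before selection.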