Robust speech recognition algorithm based on binaural speech separation and missing data technique  Cited by: 11


Authors: ZHOU Lin (周琳) [1]; ZHAO Yi-liang (赵一良); ZHU Hong-yu (朱竑谕); TANG Yi-bin (汤一彬) [2]

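Affiliations: [1] Key Laboratory of Underwater Acoustic Signal Processing of Ministry of Education, School of Information Science and Engineering, Southeast University, Nanjing 210096, Jiangsu, China; [2] College of Internet of Things Engineering, Hohai University, Changzhou 213022, Jiangsu, China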

Source: Technical Acoustics (《声学技术》), 2019, No. 5, pp. 545-553 (9 pages)

Funding: National Natural Science Foundation of China (61571106, 61501169, 61201345); Fundamental Research Funds for the Central Universities (2242013K30010)

Abstract: Robust speech recognition has important applications in human-computer interaction, smart homes, speech translation systems, and related areas. To improve recognition performance in complex acoustic environments with noise and interfering speech, this paper proposes a robust speech recognition algorithm based on binaural sound source separation and the missing data technique. The algorithm draws on the masking effect and the cocktail party effect of the human auditory system and exploits the different spatial positions of the sound sources. First, using the azimuth of the target speech, it separates the mixed speech signal within the equivalent rectangular bandwidth (ERB) sub-bands of the binaural signals to obtain the data stream of the target speech. Then, because the separated target speech is missing spectral data in some ERB sub-bands, the missing data technique is used to modify the probability calculation of the hidden Markov model (HMM), and the reconstructed spectral data are used for speech recognition. Simulation results show that the proposed algorithm significantly improves speech recognition performance in complex acoustic environments, because binaural separation removes the influence of noise and interference from the target speech data.
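The abstract does not say how the HMM probability calculation is modified. As a minimal sketch, the following assumes the common marginalization variant of the missing data technique: each state emission is a diagonal-covariance Gaussian, and a binary reliability mask marks which spectral channels survived separation. The function name `masked_log_likelihood` and its parameters are hypothetical and are not taken from the paper.

```python
import numpy as np


def masked_log_likelihood(x, mask, mean, var):
    """Log-likelihood of a diagonal Gaussian over reliable channels only.

    Assumption (not from the paper): unreliable channels are marginalized
    out, which for a diagonal covariance means dropping their terms.
    """
    x = np.asarray(x, dtype=float)
    mean = np.asarray(mean, dtype=float)
    var = np.asarray(var, dtype=float)
    mask = np.asarray(mask, dtype=bool)  # True = channel is reliable

    # Per-channel Gaussian log-density.
    per_channel = -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

    # Sum only over channels flagged as reliable.
    return float(np.sum(per_channel[mask]))
```

Under these assumptions, the returned value would stand in for the usual per-state emission log-probability during decoding, so a channel dominated by interference cannot penalize the correct state.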

Keywords: spatial hearing; binaural sound source separation; missing data technique; misrecognition rate

Classification number: H107 [Language and Writing: Chinese]

 
