Authors: PENG Yue (彭月); MENG Zu-qiang (蒙祖强) [1]; YANG Li-na (杨丽娜)
Affiliation: [1] School of Computer, Electronics and Information, Guangxi University, Nanning 530004, Guangxi, China
Source: Journal of Guangxi University (Natural Science Edition) (《广西大学学报(自然科学版)》), 2021, No. 6, pp. 1533-1548 (16 pages)
Funding: National Natural Science Foundation of China (61762009).
Abstract: Research on speech enhancement began in the 1970s and has produced four traditional families of methods: harmonic enhancement, spectral subtraction, algorithms based on speech generation models, and algorithms based on short-time spectral estimation. However, speech is a non-stationary signal whose features are not distinct in either time-domain or frequency-domain analysis, while noise is often a superposition of several sources with complex characteristics and a wide frequency band. As a result, existing methods give unsatisfactory enhancement and can even introduce musical noise. Speech is a basic and widely used form of human communication, but during voice communication it is inevitably subject to interference from environmental noise, electrical noise, the transmission medium and other sources, which degrades intelligibility for human listeners or impairs downstream speech processing such as speech recognition. It is therefore necessary to apply appropriate enhancement after the audio has been digitized. To this end, the paper proposes a new speech enhancement processing structure that integrates multiple approaches, combining the short-time Fourier transform, spectral subtraction, noise spectrum estimation and machine learning techniques to achieve stronger enhancement. Experimental comparison with a feedforward BP (back-propagation) network and an LSTM network demonstrates the effectiveness of the method, and the feasibility of accelerating it with GPU computing is also verified.
Classification No.: TN912.35 [Electronics and Telecommunications: Communication and Information Systems]