Speech enhancement method based on GRU neural networks (cited by: 4)

Authors: PENG Yue; MENG Zu-qiang [1]; YANG Li-na (School of Computer and Electronic Information, Guangxi University, Nanning 530004, China)

Affiliation: [1] School of Computer and Electronic Information, Guangxi University, Nanning 530004, Guangxi, China

Source: Journal of Guangxi University (Natural Science Edition), 2021, No. 6, pp. 1533-1548 (16 pages)

Funding: Supported by the National Natural Science Foundation of China (61762009).

Abstract: Research on speech enhancement began in the 1970s and has produced four traditional families of methods: harmonic enhancement, spectral subtraction, algorithms based on speech generation models, and algorithms based on short-time spectral estimation. However, speech is a non-stationary signal whose features are not distinct in either time-domain or frequency-domain analysis. Noise is often a superposition of several sources, with complex characteristics and a wide bandwidth, so existing methods give unsatisfactory enhancement and can even introduce musical noise. Voice communication is a basic and widely used means of human communication, but it is inevitably subject to interference from environmental noise, electrical noise, and the transmission medium, which degrades human listening and recognition as well as downstream speech processing such as speech recognition. It is therefore necessary to apply appropriate enhancement after the audio has been digitized in order to improve intelligibility. On this basis, the paper proposes a new speech enhancement processing architecture that integrates multiple approaches. The architecture combines the short-time Fourier transform, spectral subtraction, noise spectrum estimation, and machine learning techniques to achieve stronger speech enhancement. Experiments comparing it with feedforward back-propagation (BP) networks and LSTM networks demonstrate the effectiveness of the method. The feasibility of accelerating it with GPU computing is also verified.
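As a rough illustration of two building blocks named in the abstract, the short-time Fourier transform and spectral subtraction, the following Python sketch may help. It is not the authors' architecture: the function names, frame length, hop size, number of noise-only frames, and spectral floor are all assumptions made for illustration, the noise spectrum is estimated naively by averaging the first few frames, and the GRU stage is omitted entirely.

```python
import numpy as np

def stft(x, frame_len=256, hop=128):
    """Hann-windowed STFT; returns an array of shape (frames, bins).

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def spectral_subtract(noisy_spec, noise_frames=5, floor=0.01):
    """Basic magnitude spectral subtraction (hypothetical helper).

    Assumes the first `noise_frames` frames contain noise only and uses
    their mean magnitude as the noise spectrum estimate.
    """
    mag = np.abs(noisy_spec)
    phase = np.angle(noisy_spec)
    noise_mag = mag[:noise_frames].mean(axis=0)
    # Subtract the noise estimate, but keep a small fraction of the
    # original magnitude as a floor so no bin goes negative.
    clean_mag = np.maximum(mag - noise_mag, floor * mag)
    # Reuse the noisy phase, as is common in spectral subtraction.
    return clean_mag * np.exp(1j * phase)
```

Hard thresholding of this kind is one known source of the musical noise the abstract mentions, which is part of the motivation for replacing or supplementing it with a learned model.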

Keywords: speech enhancement; GRU neural network; GPU computing

Classification: TN912.35 [Electronics and Telecommunications: Communication and Information Systems]
