Single Channel Speech Enhancement Based on Deep Fully Convolutional Encoder-Decoder Neural Network  Cited by: 5


Authors: Shi Wenhua; Zhang Xiongwei [1]; Zou Xia; Sun Meng (Army Engineering University, Nanjing, Jiangsu 210007, China; Air Force Aviation University, Changchun, Jilin 130000, China)

Affiliations: [1] Army Engineering University, Nanjing, Jiangsu 210007; [2] Air Force Aviation University, Changchun, Jilin 130000

Source: Journal of Signal Processing (《信号处理》), 2019, No. 4, pp. 631-640 (10 pages)

Funding: National Natural Science Foundation of China (61471394); Jiangsu Province Outstanding Youth Fund (BK20180080)

Abstract: Conventional deep neural networks do not fully exploit the correlations in the time-frequency domain of speech. To address this, a single-channel speech enhancement method based on a deep fully convolutional encoder-decoder neural network is proposed. In the encoder, the convolution and pooling operations of the convolutional layers extract features from the time-frequency representation of the noisy speech stage by stage, yielding a high-level feature representation of the target speech while suppressing the background noise layer by layer. The decoder is structurally symmetric to the encoder: it applies deconvolution and up-sampling to the high-level representation produced by the encoder and restores the target speech layer by layer. Skip connections alleviate the vanishing-gradient problem that arises when training very deep networks. Here they are introduced between corresponding encoder and decoder layers, so that the low-level encoder feature maps, which carry the detail information of the speech, are passed to the matching decoder layers and help the decoder recover the fine details of the target speech. The effects on enhancement performance of two skip-connection schemes (feature fusion and feature concatenation) and two training loss functions (L1 and L2) are studied, and experiments verify the effectiveness of the proposed method.
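The two skip-connection schemes and the two loss functions named in the abstract can be illustrated with a minimal NumPy sketch. The abstract gives no layer sizes, so the feature-map shape `(channels, frequency bins, time frames)`, the function names, and the toy values below are all hypothetical. The sketch also assumes that "feature fusion" means an element-wise sum and that the L1 and L2 losses are the mean absolute error and mean squared error; the paper's exact definitions may differ.

```python
import numpy as np

def fuse_skip(encoder_feat, decoder_feat):
    """Feature fusion (assumed: element-wise sum); channel count is unchanged."""
    return encoder_feat + decoder_feat

def concat_skip(encoder_feat, decoder_feat):
    """Feature concatenation: stack the two maps along the channel axis (axis 0)."""
    return np.concatenate([encoder_feat, decoder_feat], axis=0)

def l1_loss(estimate, target):
    """Mean absolute error between estimated and clean spectra."""
    return np.mean(np.abs(estimate - target))

def l2_loss(estimate, target):
    """Mean squared error between estimated and clean spectra."""
    return np.mean((estimate - target) ** 2)

# Hypothetical feature maps from matching encoder and decoder layers:
# 16 channels, 32 frequency bins, 8 time frames.
rng = np.random.default_rng(0)
enc = rng.standard_normal((16, 32, 8))
dec = rng.standard_normal((16, 32, 8))

fused = fuse_skip(enc, dec)      # same shape as each input
stacked = concat_skip(enc, dec)  # channel dimension doubles

# Toy spectra: a constant error of 0.5 in every time-frequency bin.
clean = np.zeros((32, 8))
estimate = np.full((32, 8), 0.5)

print(fused.shape, stacked.shape)
print(l1_loss(estimate, clean), l2_loss(estimate, clean))
```

Under these assumptions, fusion keeps the decoder layer's channel count, whereas concatenation doubles it, so the following decoder layer needs correspondingly more input channels. For the constant error of 0.5, the L1 loss is 0.5 and the L2 loss is 0.25, which shows how the squared loss down-weights small errors relative to the absolute loss.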

Keywords: speech enhancement; skip connection; encoder-decoder; convolutional neural network

Classification number: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]

 
