Monaural speech enhancement using U-net fused with multi-head self-attention  

Authors: FAN Junyi, YANG Jibin, ZHANG Xiongwei, ZHENG Changyan

Affiliations: [1] Graduate School, Army Engineering University, Nanjing 210007; [2] College of Command and Control Engineering, Army Engineering University, Nanjing 210007; [3] Department of Test Control, High-tech Institute, Qingzhou 262500

Source: Chinese Journal of Acoustics (English edition of Shengxue Xuebao), 2023, Issue 1, pp. 98-118 (21 pages)

Funding: Supported by the National Natural Science Foundation of China (62071484).

Abstract: Under low signal-to-noise ratio (SNR) and burst noise conditions, the speech enhancement performance of existing deep learning network models is unsatisfactory. In contrast, humans can exploit the long-term correlation of speech to form an integrated perception of different speech signals, so describing the long-term dependencies of speech can help improve enhancement performance under low SNR and burst noise conditions. Inspired by this, a time-domain end-to-end monaural speech enhancement model, TU-net, is proposed, which fuses the multi-head self-attention mechanism with the U-net deep network. TU-net adopts the encoder-decoder layer structure of U-net to achieve multi-scale feature fusion, and introduces a dual-path Transformer module built on multi-head self-attention to compute the speech mask and better model long-term correlation. The model is trained with a weighted-sum loss function spanning the time, time-frequency, and perceptual domains. Simulation results show that, while keeping the number of model parameters relatively small, TU-net outperforms comparable monaural enhancement networks on several evaluation metrics, including perceptual evaluation of speech quality (PESQ), short-time objective intelligibility (STOI), and SNR gain, under low SNR and burst noise conditions.
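As a reading aid, here is a minimal PyTorch sketch of the architecture the abstract describes: a 1-D U-net encoder-decoder whose bottleneck applies multi-head self-attention and whose output is a mask multiplied onto the noisy waveform. Everything here is an assumption for illustration, not the paper's actual TU-net: the names TinyTUNet, AttnBottleneck, and weighted_sum_loss, the layer sizes, and the loss weights are all hypothetical, and a single self-attention layer stands in for the paper's dual-path Transformer module.

import torch
import torch.nn as nn

class AttnBottleneck(nn.Module):
    """Multi-head self-attention over encoder frames (a hypothetical
    stand-in for the paper's dual-path Transformer module)."""
    def __init__(self, channels, num_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, x):                      # x: (batch, channels, frames)
        seq = x.transpose(1, 2)                # -> (batch, frames, channels)
        out, _ = self.attn(seq, seq, seq)      # self-attention over the time axis
        seq = self.norm(seq + out)             # residual connection + layer norm
        return seq.transpose(1, 2)

class TinyTUNet(nn.Module):
    """Two-level 1-D U-net encoder/decoder with an attention bottleneck;
    the decoder predicts a mask that is applied to the noisy waveform."""
    def __init__(self, ch=32):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv1d(1, ch, 8, stride=4, padding=2), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv1d(ch, 2 * ch, 8, stride=4, padding=2), nn.ReLU())
        self.bottleneck = AttnBottleneck(2 * ch)
        self.dec2 = nn.Sequential(nn.ConvTranspose1d(2 * ch, ch, 8, stride=4, padding=2), nn.ReLU())
        self.dec1 = nn.ConvTranspose1d(2 * ch, 1, 8, stride=4, padding=2)   # 2*ch in: skip connection

    def forward(self, noisy):                  # noisy: (batch, 1, samples), samples divisible by 16
        e1 = self.enc1(noisy)
        e2 = self.enc2(e1)
        d2 = self.dec2(self.bottleneck(e2))
        mask = torch.sigmoid(self.dec1(torch.cat([d2, e1], dim=1)))        # U-net skip connection
        return mask * noisy                    # masked, i.e. enhanced, waveform

The weighted-sum training loss can be sketched the same way. Below, a time-domain L1 term is combined with an STFT-magnitude term as the time-frequency component; the weights a and b are placeholders, and the perceptual-domain term mentioned in the abstract is omitted for brevity.

def weighted_sum_loss(est, clean, a=0.8, b=0.2, n_fft=512):
    """Hypothetical weighted sum of a time-domain and a time-frequency loss."""
    time_loss = torch.mean(torch.abs(est - clean))                      # L1 on waveforms
    win = torch.hann_window(n_fft, device=est.device)
    mag = lambda x: torch.stft(x.squeeze(1), n_fft, window=win, return_complex=True).abs()
    tf_loss = torch.mean(torch.abs(mag(est) - mag(clean)))              # L1 on STFT magnitudes
    return a * time_loss + b * tf_loss

model = TinyTUNet()
noisy, clean = torch.randn(2, 1, 16000), torch.randn(2, 1, 16000)
loss = weighted_sum_loss(model(noisy), clean)
loss.backward()                                # end-to-end training step (optimizer omitted)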

Keywords: network; speech; noise

Classification: TP183 [Automation and Computer Technology - Control Theory and Control Engineering]; TN912.35 [Automation and Computer Technology - Control Science and Engineering]

 
