Speech Enhancement via Mask-Mapping Based Residual Dense Network  


Authors: Lin Zhou, Xijin Chen, Chaoyan Wu, Qiuyue Zhong, Xu Cheng, Yibin Tang

Affiliations: [1] School of Information Science and Engineering, Southeast University, Nanjing, 210096, China; [2] Center for Machine Vision and Signal Analysis, University of Oulu, Oulu, FI-90014, Finland; [3] College of IOT Engineering, Hohai University, Changzhou, 213022, China

Source: Computers, Materials & Continua, 2023, Issue 1, pp. 1259-1277 (19 pages)

Funding: Supported by the National Key Research and Development Program of China under Grant 2020YFC2004003 and Grant 2020YFC2004002; the National Natural Science Foundation of China (NSFC) under Grant No. 61571106.

Abstract: Masking-based and spectrum mapping-based methods are the two main approaches to speech enhancement with deep neural networks (DNN). However, mapping-based methods only utilize the phase of the noisy speech, which limits the upper bound of speech enhancement performance, while masking-based methods need to estimate the mask accurately, which remains the key problem. Combining the advantages of these two types of methods, this paper proposes the speech enhancement algorithm MM-RDN (masking-mapping residual dense network), based on masking-mapping (MM) and a residual dense network (RDN). Using the logarithmic power spectrogram (LPS) of consecutive frames, MM estimates the ideal ratio masking (IRM) matrix of consecutive frames. RDN can make full use of the feature maps of all layers. Meanwhile, by using global residual learning to combine shallow and deep features, RDN obtains global dense features from the LPS, thereby improving the estimation accuracy of the IRM matrix. Simulations show that the proposed method achieves attractive speech enhancement performance in various acoustic environments. Specifically, in untrained acoustic tests with limited priors, e.g., unmatched signal-to-noise ratio (SNR) and unmatched noise category, MM-RDN still outperforms the existing convolutional recurrent network (CRN) method in perceptual evaluation of speech quality (PESQ) and other evaluation indexes. This indicates that the proposed algorithm generalizes better to untrained conditions.
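The two quantities named in the abstract, the LPS input feature and the IRM target, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the exact IRM definition, so the commonly used form sqrt(S^2 / (S^2 + N^2)) is assumed here, and the frame length, hop size, synthetic signals, and helper names (`stft_mag`, `log_power_spectrogram`, `ideal_ratio_mask`) are all hypothetical choices for illustration.

```python
import numpy as np

def stft_mag(x, frame_len=256, hop=128):
    """Magnitude spectrogram (frames x bins) from a Hann-windowed framed FFT."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1))

def log_power_spectrogram(mag, eps=1e-10):
    """LPS feature: natural log of the power spectrogram (eps avoids log(0))."""
    return np.log(mag ** 2 + eps)

def ideal_ratio_mask(speech_mag, noise_mag, eps=1e-10):
    """Assumed IRM form: sqrt(S^2 / (S^2 + N^2)), bounded in [0, 1]."""
    s2 = speech_mag ** 2
    n2 = noise_mag ** 2
    return np.sqrt(s2 / (s2 + n2 + eps))

# Synthetic stand-ins: a 440 Hz tone as "speech" and white Gaussian noise.
rng = np.random.default_rng(0)
fs = 16000
t = np.arange(fs) / fs
speech = np.sin(2 * np.pi * 440 * t)
noise = 0.5 * rng.standard_normal(t.shape)
noisy = speech + noise

S = stft_mag(speech)
N = stft_mag(noise)
Y = stft_mag(noisy)

lps = log_power_spectrogram(Y)   # network input in a mask-estimation setup
irm = ideal_ratio_mask(S, N)     # training target (needs clean speech and noise)
enhanced_mag = irm * Y           # masked magnitude; the noisy phase would be reused
```

In a trained system the network would predict the mask from `lps` alone, since the clean speech and noise are unavailable at test time; the oracle mask above only shows what the estimate is meant to approximate.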

Keywords: Mask-mapping-based method; residual dense block; speech enhancement

Classification code: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]

 
