Optimizing the Perceptual Quality of Time-Domain Speech Enhancement with Reinforcement Learning  (Cited by: 1)


Authors: Xiang Hao, Chenglin Xu, Lei Xie, Haizhou Li

Affiliations: [1] School of Computer Science, Northwestern Polytechnical University, Xi'an 710000, China; [2] Department of Electrical and Computer Engineering, National University of Singapore, Singapore 710129, Singapore

Source: Tsinghua Science and Technology, 2022, No. 6, pp. 939-947 (9 pages). Journal of Tsinghua University (Natural Science Edition, English Edition)

Funding: Supported by the National Research Foundation of Singapore (No. AISG-100E-2018-006) and Human-Robot Interaction Phase 1 (No. 1922500054), under the National Robotics Programme, Singapore.

Abstract: In neural speech enhancement, a mismatch exists between the training objective, i.e., Mean-Square Error (MSE), and perceptual quality evaluation metrics, i.e., Perceptual Evaluation of Speech Quality (PESQ) and Short-Time Objective Intelligibility (STOI). We propose a novel reinforcement learning algorithm and network architecture that incorporate a non-differentiable perceptual quality evaluation metric into the objective function using a dynamic filter module. Unlike the traditional dynamic filter implementation, which directly generates a convolution kernel, we use a filter generation agent to predict the probability density function of a multivariate Gaussian distribution, from which we sample the convolution kernel. Experimental results show that the proposed reinforcement learning method clearly improves perceptual quality over supervised learning methods trained with the MSE objective function.
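The sampling idea in the abstract can be illustrated with a toy policy-gradient loop. The sketch below is not the authors' network or training procedure: it assumes a diagonal Gaussian with a fixed standard deviation, a plain REINFORCE update with a mean-reward baseline, and a hypothetical `toy_reward` (a rounded negative MSE) standing in for a non-differentiable metric such as PESQ. All function names are illustrative.

```python
import math
import random


def convolve_same(signal, kernel):
    """Apply a 1-D FIR kernel with zero padding, keeping the input length."""
    half = len(kernel) // 2
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = n + j - half
            if 0 <= idx < len(signal):
                acc += w * signal[idx]
        out.append(acc)
    return out


def toy_reward(enhanced, clean):
    """Hypothetical stand-in for a perceptual metric (assumption, not PESQ).

    Rounding makes the score piecewise constant, so it offers no useful
    gradient and has to be optimized through sampling.
    """
    mse = sum((e - c) ** 2 for e, c in zip(enhanced, clean)) / len(clean)
    return -round(mse, 3)


def sample_kernel(mean, sigma, rng):
    """Draw a kernel from a diagonal Gaussian N(mean, sigma^2 I)."""
    return [rng.gauss(m, sigma) for m in mean]


def reinforce_step(mean, sigma, noisy, clean, rng, n_samples=16, lr=0.05):
    """One REINFORCE update of the Gaussian mean with a mean-reward baseline."""
    kernels = [sample_kernel(mean, sigma, rng) for _ in range(n_samples)]
    rewards = [toy_reward(convolve_same(noisy, k), clean) for k in kernels]
    baseline = sum(rewards) / n_samples
    grad = [0.0] * len(mean)
    for k, r in zip(kernels, rewards):
        for i in range(len(mean)):
            # d/d mean_i of log N(k | mean, sigma^2 I) = (k_i - mean_i) / sigma^2
            grad[i] += (r - baseline) * (k[i] - mean[i]) / sigma ** 2
    new_mean = [m + lr * g / n_samples for m, g in zip(mean, grad)]
    return new_mean, baseline


def demo(seed=0, steps=200):
    """Train the kernel mean on one synthetic noisy sine wave."""
    rng = random.Random(seed)
    clean = [math.sin(2 * math.pi * 0.05 * n) for n in range(64)]
    noisy = [c + rng.gauss(0.0, 0.3) for c in clean]
    mean = [0.0] * 5  # start from an all-zero 5-tap kernel
    first = toy_reward(convolve_same(noisy, mean), clean)
    for _ in range(steps):
        mean, _ = reinforce_step(mean, 0.1, noisy, clean, rng)
    last = toy_reward(convolve_same(noisy, mean), clean)
    return first, last, mean
```

Calling `demo()` returns the reward of the initial kernel, the reward of the learned kernel mean, and the mean itself. In the setting the abstract describes, the Gaussian parameters would be predicted by a neural agent from the input speech rather than stored as a free vector, and the reward would come from a real perceptual metric.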

Keywords: speech enhancement; neural networks; dynamic filter; reinforcement learning

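Classification codes: TN912.35 [Electronics and Telecommunications: Communication and Information Systems]; TP181 [Electronics and Telecommunications: Information and Communication Engineering]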

 
