Patch-based domain adversarial training for speech enhancement (original title: 基于Patch域对抗训练的语音增强)


Authors: WANG Hongtao (王鸿韬), LU Zhihua (陆志华), YE Qingwei (叶庆卫) [1], ZHANG Lianjun (章联军) [1]

Affiliation: [1] Faculty of Electrical Engineering and Computer Science, Ningbo University, Ningbo 315211, Zhejiang, China

Source: Telecommunications Science (《电信科学》), 2024, No. 10, pp. 52-60 (9 pages)

Abstract: Deep learning-based speech enhancement methods often suffer from a mismatch between the distributions of the training data and the test data. The mismatch can involve the speakers, the speech content, the noise types, and the signal-to-noise ratios of the two datasets. Severe mismatch can significantly degrade speech enhancement performance. To address this problem, a speech enhancement method based on Patch domain adversarial training is proposed. Building on previous domain adversarial training methods for speech enhancement, the method relies on implicit modeling by the domain discriminator, so that an entire utterance is divided into multiple independent patches that are discriminated separately. This enables adaptive learning on the training data, which reduces the distribution difference between the training and test data and improves the model's enhancement capability on the test data. Experimental results show that the method outperforms previous methods under varying degrees of data mismatch and that, as an adversarial training approach, it maintains good stability.
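The patch-level discrimination described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract says the patches arise implicitly from the domain discriminator, whereas this toy version splits the frame sequence explicitly, uses a hypothetical linear discriminator (`weight`, `bias`), and averages a binary cross-entropy domain loss over patches. All function names and the `patch_len` parameter are assumptions made for illustration.

```python
import math


def split_into_patches(frames, patch_len):
    """Split a list of feature frames into non-overlapping patches.

    Frames that do not fill a complete final patch are dropped.
    """
    n_patches = len(frames) // patch_len
    return [frames[i * patch_len:(i + 1) * patch_len] for i in range(n_patches)]


def patch_logit(patch, weight, bias):
    """Hypothetical toy discriminator: mean-pool the patch, then a linear layer."""
    dim = len(weight)
    pooled = [sum(frame[d] for frame in patch) / len(patch) for d in range(dim)]
    return sum(w * x for w, x in zip(weight, pooled)) + bias


def bce_with_logit(logit, label):
    """Numerically stable binary cross-entropy computed from a raw logit."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def patch_domain_loss(frames, domain_label, weight, bias, patch_len):
    """Average the domain-classification loss over all patches of one utterance.

    domain_label is 1.0 for one domain (e.g. training data) and 0.0 for the other.
    """
    patches = split_into_patches(frames, patch_len)
    if not patches:
        raise ValueError("utterance is shorter than one patch")
    losses = [bce_with_logit(patch_logit(p, weight, bias), domain_label)
              for p in patches]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Ten 2-dimensional frames give two patches of four frames each.
    utterance = [[0.1 * i, -0.2 * i] for i in range(10)]
    print(patch_domain_loss(utterance, 1.0, [0.5, -0.5], 0.0, patch_len=4))
```

In a domain adversarial setup, the discriminator would be trained to minimize a loss of this kind while the enhancement network is trained to increase it (for example through a gradient reversal layer), so that each patch, rather than the whole utterance, contributes its own domain decision.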

Keywords: speech enhancement; domain adversarial training; domain adaptation

Classification code: TN915 [Electronics and Telecommunications: Communication and Information Systems]

 
