Reverberation-aware microphone array speech enhancement algorithm based on deep learning

Authors: HE Wei, LIU Yuji [1,2], TONG Feng, KANG Yuanxun [3], FENG Wanjian

Affiliations: [1] College of Ocean and Earth Sciences, Xiamen University, Xiamen 361005, Fujian, China; [2] National and Local Joint Engineering Research Center for Navigation and Location Service Technology of Xiamen University, Xiamen 361005, Fujian, China; [3] Xiamen Yealink Network Technology Co., Ltd., Xiamen 361015, Fujian, China

Source: Journal of Xiamen University (Natural Science), 2024, No. 2, pp. 287-295, 304 (10 pages)

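Funding: Science and Technology Commission of Shanghai Municipality, "Science and Technology Innovation Action Plan" project (21DZ1205502); Xiamen Marine Industry project (22CZB012HJ13).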

Abstract: [Objective] To address the data dependence of microphone array algorithms based on deep neural network spectral estimation, a deep-learning-based reverberation-aware microphone array speech enhancement algorithm is proposed. [Methods] First, the beamforming output of the microphone array is cross-correlated with the original signal to capture the reverberation characteristics of the current environment in the form of an approximate room impulse response, which serves as the input to an LSTM network. The network model is trained with clean speech as the target and outputs a room impulse response generalized vector. Finally, the approximate room impulse response and the room impulse response generalized vector are combined to obtain the coefficients of a post de-reverberation filter, achieving speech enhancement. [Results] In simulations and experiments, the proposed method outperforms beamforming, the weighted prediction error algorithm and a conventional deep-learning de-reverberation algorithm in different reverberant scenarios. [Conclusions] The proposed method shows relatively stable de-reverberation capability across different reverberant scenarios and generalizes well.

[Objective] Microphone array techniques have been extensively applied to speech enhancement by exploiting the spatial information provided by multiple microphone channels. However, because rooms of different sizes, boundary materials and reflectors produce diverse reverberation characteristics, the speech enhancement performance of microphone arrays deteriorates significantly. In recent years, deep-learning-optimized microphone array signal processing has been investigated to remedy the problems caused by reverberation, but it suffers from data dependence and thus cannot adapt to reverberant scenes that are absent from the training data. In this paper, a novel reverberation-aware (RA) microphone array speech enhancement algorithm is proposed that first obtains the reverberant feature and then uses a deep-learning model to decouple the negative impact of the environment, thus facilitating environment-adaptive microphone array speech enhancement under diverse reverberant scenarios. [Methods] The proposed RA microphone array speech enhancement algorithm consists of a training stage and a testing stage. Specifically, in the training stage, the simulated reverberant signal is used to obtain an approximate room impulse response (ARIR) by correlating the reverberant signal with its beamforming output. Then, with clean speech as the training target, an RA model is designed that takes the ARIR and the beamformed signal as the training input. Consequently, a room impulse response (RIR) generalized vector (RGV) can be produced, which generalizes the de-reverberation model with respect to diverse RIRs as well as the uncontrolled speech. In the practical testing stage, the practical ARIR is similarly obtained by correlating the received reverberant signal with its beamforming output. Afterward, the resulting RGV is convolved with the practical ARIR to obtain the coefficients of a post de-reverberation filter, which removes the reverberation corresponding to the ARIR. [Results] Performance of the proposed RA speech enhancement algo

Keywords: reverberation; microphone array; beamforming; room impulse response; deep learning; long short-term memory

Classification code: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]
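
The testing-stage signal flow described in the abstract (beamform, cross-correlate to get an ARIR, convolve the ARIR with an RGV to get post-filter coefficients) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the beamformer is assumed to be a plain delay-and-sum with zero steering delays, the peak normalization of the ARIR is an illustrative choice, the post-filter is assumed to act on the beamformed signal, and the RGV (which the paper obtains from a trained LSTM) is replaced by a unit-impulse placeholder.

```python
import numpy as np


def delay_and_sum(signals):
    """Average the channels of a (channels, samples) array.

    Assumption: the source is at broadside, so no steering delays are applied.
    """
    signals = np.asarray(signals, dtype=float)
    return signals.mean(axis=0)


def approximate_rir(reverberant, beamformed, length):
    """Cross-correlate one reverberant channel with the beamformed output.

    Keeps lags 0..length-1 and scales the result so its largest magnitude
    is 1 (the scaling is an illustrative choice, not taken from the paper).
    """
    reverberant = np.asarray(reverberant, dtype=float)
    beamformed = np.asarray(beamformed, dtype=float)
    if length > len(reverberant):
        raise ValueError("length must not exceed the reverberant signal length")
    full = np.correlate(reverberant, beamformed, mode="full")
    zero_lag = len(beamformed) - 1  # index of lag 0 in the 'full' output
    arir = full[zero_lag:zero_lag + length]
    peak = np.max(np.abs(arir))
    return arir / peak if peak > 0 else arir


def post_filter_coefficients(arir, rgv):
    """Convolve the RGV with the ARIR to form the post-filter coefficients."""
    return np.convolve(rgv, arir)


def apply_post_filter(beamformed, coefficients):
    """Filter the beamformed signal (assumed target) and keep its length."""
    return np.convolve(beamformed, coefficients)[:len(beamformed)]


# Synthetic demo: white-noise source, a two-tap room response, 4 identical mics.
rng = np.random.default_rng(0)
source = rng.standard_normal(2000)
rir = np.zeros(32)
rir[0], rir[10] = 1.0, 0.5
mics = np.stack([np.convolve(source, rir)[:len(source)]] * 4)

beamformed = delay_and_sum(mics)
arir = approximate_rir(mics[0], beamformed, length=32)
rgv = np.array([1.0])  # placeholder; the paper's RGV comes from a trained LSTM
coefficients = post_filter_coefficients(arir, rgv)
enhanced = apply_post_filter(beamformed, coefficients)
```

With the unit-impulse placeholder the coefficients simply equal the ARIR, so the demo only exercises the data flow; any de-reverberation effect would depend on an RGV produced by a trained model.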
