联合深度神经网络和凸优化的单通道语音增强算法 (cited by: 5)

Monaural speech enhancement combining deep neural network and convex optimization

Authors: ZHANG Xiaoyan (张晓艳); ZHANG Tianqi (张天骐) [1]; GE Wanying (葛宛营); BAI Yangliu (白杨柳)

Affiliation: [1] School of Communication and Information Engineering / Chongqing Key Laboratory of Signal and Information Processing, Chongqing University of Posts and Telecommunications, Chongqing 400065

Source: Acta Acustica (《声学学报》), 2021, No. 3, pp. 471-480 (10 pages)

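Funding: Supported by the National Natural Science Foundation of China (61671095, 61702065, 61701067, 61771085); the Chongqing Municipal Key Laboratory of Signal and Information Processing Construction Project (CSTC2009CA2003); the Chongqing Graduate Research and Innovation Project (CYS19248); and the Scientific Research Project of the Chongqing Municipal Education Commission (KJ1600427, KJ1600429).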

Abstract: The accuracy of noise estimation directly affects the performance of a speech enhancement algorithm. To improve the noise suppression of current speech enhancement algorithms and to solve the associated unconstrained optimization problem effectively, a time-frequency mask optimization algorithm combining a deep neural network (DNN) with convex optimization is proposed for monaural speech enhancement. First, the power spectrum of the noisy speech is extracted as the input feature of the DNN. Second, the inter-channel correlation (ICC) factor, i.e. the within-band cross-correlation coefficient between the noise and the noisy speech, is taken as the training target of the DNN. Then, the correlation factor estimated by the DNN model is used to construct the objective function of the convex optimization. Finally, combining the DNN with convex optimization, a new hybrid conjugate gradient method iteratively refines the initial mask, and the updated mask is used to synthesize the enhanced speech. Simulation results show that, at low signal-to-noise ratios (SNR) under different background noises, the new mask yields better log-spectral distance (LSD), perceptual evaluation of speech quality (PESQ), short-time objective intelligibility (STOI) and segmental SNR (segSNR) scores than the mask before optimization, improving the overall speech quality while effectively suppressing noise.
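The abstract gives neither the objective function nor the update rule of the "new hybrid conjugate gradient method", so the last step can only be illustrated under assumptions. The sketch below refines an initial mask by minimizing a toy convex quadratic with a well-known hybrid conjugate gradient rule, beta = max(0, min(beta_PRP, beta_FR)). The objective, the mask target `1 - rho`, the weight `lam` and the function name `refine_mask` are all hypothetical and are not taken from the paper; `rho` stands in for the correlation factor that a trained DNN would output.

```python
import numpy as np

def refine_mask(m0, rho, power, lam=0.1, iters=100, tol=1e-20):
    """Refine an initial time-frequency mask m0 by hybrid conjugate gradient.

    Toy objective (an assumption, NOT the paper's):
        f(m) = 0.5 * sum(power * (m - (1 - rho))**2)
             + 0.5 * lam * sum((m - m0)**2)
    rho   : stand-in for the DNN-estimated correlation factor, in [0, 1]
    power : noisy-speech power spectrum, same shape as m0
    """
    m0 = np.asarray(m0, dtype=float)
    power = np.asarray(power, dtype=float)
    target = 1.0 - np.asarray(rho, dtype=float)  # hypothetical mask target
    h = power + lam                              # diagonal Hessian of f

    def grad(m):
        return power * (m - target) + lam * (m - m0)

    m = m0.copy()
    g = grad(m)
    d = -g                                       # first direction: steepest descent
    for _ in range(iters):
        gg = np.sum(g * g)
        if gg < tol:                             # gradient is numerically zero
            break
        # Exact line search, available because f is quadratic.
        alpha = -np.sum(g * d) / np.sum(d * h * d)
        m = m + alpha * d
        g_new = grad(m)
        beta_fr = np.sum(g_new * g_new) / gg           # Fletcher-Reeves
        beta_prp = np.sum(g_new * (g_new - g)) / gg    # Polak-Ribiere-Polyak
        beta = max(0.0, min(beta_prp, beta_fr))        # hybrid rule
        d = -g_new + beta * d
        g = g_new
    return np.clip(m, 0.0, 1.0)                  # keep the mask in [0, 1]

# Usage with random stand-in data (4 frames x 5 frequency bins).
rng = np.random.default_rng(0)
rho = rng.uniform(0.0, 1.0, size=(4, 5))
power = rng.uniform(0.1, 2.0, size=(4, 5))
m0 = rng.uniform(0.0, 1.0, size=(4, 5))
mask = refine_mask(m0, rho, power)
enhanced_power = mask * power                    # masked power spectrum
```

Because this toy objective is separable, its minimizer has the closed form `(power * (1 - rho) + lam * m0) / (power + lam)`, which makes the iteration easy to check; a realistic objective would couple time-frequency bins and need an inexact line search.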

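Keywords: noisy speech; gradient descent; DNN; noise suppression; cross-correlation coefficient; low SNR; enhancement algorithm; convex optimization; deep neural network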

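Classification: TN912.35 [Electronics and Telecommunications: Communication and Information Systems]; TP183 [Electronics and Telecommunications: Information and Communication Engineering]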
