检索规则说明:AND代表“并且”;OR代表“或者”;NOT代表“不包含”;(注意必须大写,运算符两边需空一格)
检 索 范 例 :范例一: (K=图书馆学 OR K=情报学) AND A=范并思 范例二:J=计算机应用与软件 AND (U=C++ OR U=Basic) NOT M=Visual
作 者:韩伟[1] 张雄伟[1] 闵刚[1,2] 张启业[3]
机构地区:[1]解放军理工大学 [2]西安通信学院 [3]中国人民解放军96637部队
出 处:《自动化学报》2017年第2期248-258,共11页Acta Automatica Sinica
基 金:国家自然科学基金(61471394;61402519);江苏省自然科学基金(BK20140071;BK20140074)资助~~
摘 要:本文将心理声学掩蔽特性应用于基于深度神经网络(Deep neural network,DNN)的单通道语音增强任务中,提出了一种具有感知掩蔽特性的DNN结构.首先,提出的DNN对带噪语音幅度谱特征进行训练并分别得到纯净语音和噪声的幅度谱估计.其次,利用估计的纯净语音幅度谱计算噪声掩蔽阈值.然后,将噪声掩蔽阈值和估计的噪声幅度谱联合计算得到一个感知增益函数.最后,利用感知增益函数从带噪语音幅度谱中估计出增强语音幅度谱.在TIMIT数据库上,对不同信噪比下的20种噪声进行的仿真实验表明,无论噪声类型是否在语音的训练集中出现,所提出的感知掩蔽DNN都能够在有效去除噪声的同时保持较小的语音失真,增强效果明显优于常见的DNN增强方法以及NMF(Nonnegative matrix factorization)增强方法.A new deep neural network (DNN) is proposed for single-channel speech enhancement, which incorporates the perceptual masking properties of psychoacoustic models. Firstly, the proposed DNN is trained to learn both the clean speech magnitude spectrum and the noise magnitude spectrum from the noisy magnitude spectrum. Secondly, the estimated clean speech magnitude spectrum is used to calculate the noise masking threshold. Then, the noise masking threshold and the estimated noise magnitude spectrum are combined to calculate a perceptual gain function. Finally, the enhanced speech magnitude spectrum are obtained by jointly training the perceptual gain function and the noisy speech magnitude spectrum. Experimental results on TIMIT with 20 noise types at various SNR (signal-noise ratio) levels demonstrate that the proposed perceptual masking DNN can effectively remove the noise while maintaining small speech distortion, so as to obtain better performance than the common DNN methods and the NMF (nonnegative matrix factorization) method, no matter noise conditions are included in the training set or not.
分 类 号:TP183[自动化与计算机技术—控制理论与控制工程] TN912.35[自动化与计算机技术—控制科学与工程]
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在链接到云南高校图书馆文献保障联盟下载...
云南高校图书馆联盟文献共享服务平台 版权所有©
您的IP:216.73.216.145