Speech enhancement method combining harmonic-structure phase estimation and amplitude compensation

Authors: DONG Xian, SHAO Yubin[1], DU Qingzhi[1], LONG Hua[1], MA Dinan

Affiliations: [1] Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, P.R. China; [2] Yunnan Provincial Key Laboratory of Media Convergence, Kunming 650500, P.R. China

Source: Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition), 2024, No. 5, pp. 935-944 (10 pages)

Funding: Yunnan Provincial Key Laboratory of Media Convergence project (220235205).

Abstract: Traditional speech enhancement methods usually enhance only the amplitude of the noisy speech signal and neglect phase information, even though the phase spectrum also contributes positively to speech intelligibility and perceived quality. To address the insufficient treatment of phase in traditional methods and the low intelligibility that commonly results from speech enhancement, this paper proposes a speech enhancement method that combines harmonic-structure phase estimation with amplitude compensation. The method relies on phase estimation of the harmonic structure to restore voiced speech. Because unvoiced features remain corrupted by background noise after the voiced-phase estimation, which causes loss of speech information, a background-noise smoothing strategy is introduced to suppress the influence of the noise. In addition, the harmonic ratio is used to separate the harmonic structure from unvoiced features, and a harmonic decision determines whether the background noise should be smoothed. To prevent unreliable decisions from discarding unvoiced features, unvoiced feature information is used in addition to the harmonic decision. Experimental results show that, in a white-noise environment, the proposed method improves the signal-to-noise ratio (SNR) by 12.02 dB, the perceptual evaluation of speech quality (PESQ) score by 1.03, and the short-time objective intelligibility (STOI) score by 0.07, confirming that the method effectively reduces speech distortion and improves the quality and intelligibility of speech in noisy environments.
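The abstract does not state how the harmonic ratio or the harmonic decision is computed. As a rough illustration only, the sketch below assumes one common definition: the peak of the normalized autocorrelation of a frame over a range of candidate pitch lags. The function names, the lag range, and the 0.5 threshold are hypothetical choices for this example and are not taken from the paper.

```python
import math
import random


def harmonic_ratio(frame, min_lag, max_lag):
    """Peak normalized autocorrelation over candidate pitch lags.

    Assumed definition (not from the paper): values near 1 suggest a
    strongly periodic (voiced) frame, values near 0 a noise-like one.
    """
    n = len(frame)
    best = 0.0
    for lag in range(min_lag, max_lag + 1):
        a = frame[: n - lag]   # frame without its last `lag` samples
        b = frame[lag:]        # frame shifted by `lag` samples
        num = sum(x * y for x, y in zip(a, b))
        den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
        if den > 0.0:
            best = max(best, num / den)
    return best


def is_harmonic(frame, min_lag, max_lag, threshold=0.5):
    """Hypothetical harmonic decision: True if the frame looks voiced."""
    return harmonic_ratio(frame, min_lag, max_lag) >= threshold


# Synthetic frames at an assumed 8 kHz sampling rate.
fs = 8000
f0 = 200  # fundamental frequency in Hz, i.e. a 40-sample pitch period
voiced_frame = [
    sum(math.sin(2 * math.pi * k * f0 * t / fs) / k for k in (1, 2, 3))
    for t in range(400)
]
rng = random.Random(0)
noise_frame = [rng.gauss(0.0, 1.0) for _ in range(400)]

# Lags 20..160 correspond to pitch candidates between 400 Hz and 50 Hz.
print(is_harmonic(voiced_frame, 20, 160))
print(is_harmonic(noise_frame, 20, 160))
```

In a pipeline of the kind the abstract describes, a decision like this would be made per frame to choose between harmonic phase estimation and background-noise smoothing. The abstract also notes that the decision alone can be unreliable, which is why the method additionally uses unvoiced feature information.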

Keywords: phase estimation; speech enhancement; harmonic structure; amplitude compensation

Classification number: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]
