Voiced Separation of Monophonic Two-person Mixed Speech Based on ASA


Authors: LI Li; ZHANG Erhua[1]; TANG Zhenmin[1]

Affiliation: [1] School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094

Source: Computer & Digital Engineering, 2023, No. 12, pp. 2918-2923 (6 pages)

Abstract: Monaural speech separation plays an important role in applications such as speech recognition in noisy environments and multimedia retrieval. This paper studies a method, based on auditory scene analysis (ASA), for separating the voiced speech in monaural two-speaker mixtures. First, the mixed speech is transformed to the time-frequency domain by the Fourier transform. Then, using the pitch period trajectory as a cue, a comb filter extracts the spectrum of each harmonic of the voiced speech. Finally, the separated voiced speech is reconstructed by the inverse Fourier transform. The cepstrum method is used to estimate the pitch period of each frame, and the continuity of the pitch period is exploited to draw a pitch period spectrogram from which the pitch trajectories of the two speakers are estimated. Because pitch trajectories weaken or even disappear in two-speaker mixed speech, the paper improves the traditional cepstrum computation; the core idea is to apply positive half-cycle clipping to the trigonometric function whose inner product with the spectrum is taken in the traditional cepstrum computation. Experimental results show that the improved cepstrum algorithm enhances weakened pitch trajectories in two-speaker mixed speech, recovers some trajectories that had disappeared, and significantly improves the intelligibility of the separated speech.
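As an illustration of the per-frame pitch estimation step, the following is a minimal sketch of the classical (unmodified) cepstrum method, assuming NumPy. It is not the authors' implementation: the abstract does not give the details of the half-cycle clipping modification, so that step is not reproduced here. The function names, the Hamming window, the 60-400 Hz search range, and the synthetic test signal are all assumptions chosen for the example.

```python
import numpy as np


def cepstral_pitch_period(frame, fs, f0_min=60.0, f0_max=400.0):
    """Estimate the pitch period (in samples) of one frame.

    Classical real-cepstrum method: the log magnitude spectrum of voiced
    speech is periodic in frequency with spacing f0, so its inverse
    transform has a peak at quefrency fs / f0.
    The search range f0_min..f0_max is an assumed default, not a value
    taken from the paper.
    """
    n = len(frame)
    windowed = frame * np.hamming(n)          # assumed window choice
    spectrum = np.fft.rfft(windowed)
    log_mag = np.log(np.abs(spectrum) + 1e-10)  # small offset avoids log(0)
    cepstrum = np.fft.irfft(log_mag, n=n)

    # Restrict the peak search to quefrencies of plausible pitch periods.
    q_min = int(fs / f0_max)
    q_max = min(int(fs / f0_min), n // 2)
    return q_min + int(np.argmax(cepstrum[q_min:q_max]))


def synth_voiced(f0, fs, n):
    """Hypothetical test signal: harmonics of f0 up to Nyquist, 1/k amplitudes."""
    t = np.arange(n) / fs
    n_harm = int((fs / 2 - 1) // f0)
    signal = np.zeros(n)
    for k in range(1, n_harm + 1):
        signal += np.cos(2 * np.pi * f0 * k * t) / k
    return signal


if __name__ == "__main__":
    fs = 16000
    frame = synth_voiced(200.0, fs, 1024)
    period = cepstral_pitch_period(frame, fs)
    print("estimated period (samples):", period)
    print("estimated f0 (Hz):", fs / period)
```

Applying this estimator frame by frame yields one candidate period per frame; the paper goes further by tracking two trajectories through a pitch period spectrogram, which this single-peak sketch does not attempt.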

Keywords: speech separation; monaural; pitch enhancement; cepstrum

Classification number: TN912.3 [Electronics and Telecommunications - Communication and Information Systems]

 
