Authors: LI Li (李黎); ZHANG Erhua (张二华) [1]; TANG Zhenmin (唐振民) [1] (School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094)
Affiliation: [1] School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094
Source: Computer & Digital Engineering (《计算机与数字工程》), 2023, No. 12, pp. 2918-2923 (6 pages)
Abstract: Monaural speech separation plays an important role in applications such as speech recognition in noisy environments and multimedia retrieval. This paper studies a method, based on auditory scene analysis (ASA), for separating the voiced speech of two talkers from a single-channel mixture. The mixed speech is first transformed to the time-frequency domain by the Fourier transform; then, with the pitch-period trajectory as the cue, a comb filter extracts the spectrum of each harmonic of the voiced speech; finally, the separated voiced speech is reconstructed by the inverse Fourier transform. The pitch period of each frame is estimated with the cepstrum method, and the continuity of the pitch period is used to draw a pitch-period spectrogram from which the pitch trajectories of the two talkers are estimated. To address the weakening or even disappearance of pitch trajectories in two-talker mixed speech, the paper improves the traditional cepstrum computation: the core idea is to apply positive-half-cycle clipping to the trigonometric function whose inner product with the spectrum is taken in the traditional cepstrum computation. Experimental results show that the improved cepstrum algorithm enhances weakened pitch trajectories in two-talker mixed speech, recovers some trajectories that had disappeared, and significantly improves the intelligibility of the separated speech.
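The per-frame pitch estimation step mentioned in the abstract can be illustrated with a minimal sketch of the traditional real-cepstrum method. This is not the authors' implementation: the function name `cepstrum_pitch`, the Hamming window, the 60-400 Hz search range, and the synthetic single-talker test frame are all assumptions made for illustration, and the paper's clipping modification is not reproduced because the abstract does not specify its details.

```python
import numpy as np

def cepstrum_pitch(frame, fs, fmin=60.0, fmax=400.0):
    """Estimate the pitch period (seconds) of one voiced frame.

    Traditional real cepstrum: inverse FFT of the log magnitude spectrum.
    fmin/fmax bound the pitch search range (assumed values).
    """
    windowed = frame * np.hamming(len(frame))
    spectrum = np.fft.rfft(windowed)
    # Small offset avoids log(0) on empty bins.
    log_mag = np.log(np.abs(spectrum) + 1e-10)
    cepstrum = np.fft.irfft(log_mag)
    # Search only quefrencies that correspond to plausible pitch periods.
    q_lo = int(fs / fmax)
    q_hi = int(fs / fmin)
    peak = q_lo + int(np.argmax(cepstrum[q_lo:q_hi + 1]))
    return peak / fs

# Synthetic voiced frame (hypothetical test signal): harmonics of 200 Hz.
fs = 16000
t = np.arange(1024) / fs
f0 = 200.0
frame = sum(np.cos(2 * np.pi * f0 * h * t) for h in range(1, 40))

period = cepstrum_pitch(frame, fs)
print("estimated pitch period (samples):", period * fs)
print("estimated F0 (Hz):", 1.0 / period)
```

The harmonics of a voiced sound are evenly spaced in frequency, so the log magnitude spectrum is roughly periodic, and that periodicity appears as a peak in the cepstrum at the quefrency equal to the pitch period. In a two-talker mixture the two harmonic series overlap, which is the situation in which the abstract reports that pitch trajectories weaken or disappear.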
Classification: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]