Authors: CHEN Sizhu, LONG Hua [1], SHAO Yubin [1]
Affiliations: [1] Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China; [2] Radio Monitoring Center of Yunnan Province, Kunming 650228, China
Source: Computer Science (《计算机科学》), 2024, Issue S02, pp. 367-372 (6 pages)
Funding: Open Fund of the Yunnan Provincial Key Laboratory of Media Convergence (320225403).
Abstract: To address the low accuracy and poor robustness of language identification for broadcast speech signals at low signal-to-noise ratio (SNR), a language identification algorithm is proposed that is based on MFCC improved with the wavelet packet transform and on energy operator cepstral features. First, the wavelet packet transform replaces the Fourier transform and Mel filtering in MFCC, yielding the WMFCC feature parameters. This retains the auditory perception characteristics of the human ear while improving the high-frequency analysis capability and analysis precision for the speech signal, overcoming the limitations of the Fourier transform. Second, the Teager energy operator cepstrum is extracted to capture the instantaneous energy of the speech, and it is fused with the improved MFCC feature parameters to form a new feature parameter, TWMFCC. Finally, to further improve recognition of low-SNR speech, a VMD adaptive Wiener filtering denoising algorithm is proposed. Experiments compare the recognition performance of the proposed features with that of traditional features. The average recognition accuracy of the proposed features improves significantly: for noisy speech without any speech denoising, it is 13.02% higher than that of traditional MFCC. The proposed features effectively alleviate the low recognition accuracy of traditional features at low SNR and show strong noise immunity and robustness.
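The abstract names the Teager energy operator as the source of the instantaneous-energy feature but does not give its form. As a minimal sketch, assuming the standard discrete definition psi[n] = x[n]^2 - x[n-1]*x[n+1] (the function name `teager_energy` and the test tone are illustrative and not taken from the paper, and the cepstrum and fusion steps are not shown), the operator can be computed as follows:

```python
import math


def teager_energy(x):
    """Discrete Teager energy operator (standard textbook form, assumed here).

    Returns psi[n] = x[n]^2 - x[n-1] * x[n+1] for the interior samples,
    so the output is two samples shorter than the input.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


# Illustrative input: a pure tone A*cos(omega*n). For such a tone the
# operator is constant and equals (A*sin(omega))^2, which tracks both
# amplitude and frequency of the oscillation.
A, omega = 2.0, 0.3
tone = [A * math.cos(omega * n) for n in range(64)]
psi = teager_energy(tone)
expected = (A * math.sin(omega)) ** 2
```

Because the operator needs only three neighbouring samples, it responds to rapid changes in signal energy; how the paper turns this into a cepstral feature is not specified in the abstract.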
Keywords: language identification; MFCC; wavelet packet transform; energy operator cepstrum; GMM-UBM
Classification code: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]