Language Recognition Based on Improved MFCC and Energy Operator Cepstrum

Authors: CHEN Sizhu; LONG Hua [1]; SHAO Yubin [1]

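Affiliations: [1] Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China; [2] Radio Monitoring Center of Yunnan Province, Kunming 650228, China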

Source: Computer Science (《计算机科学》), 2024, Issue S02, pp. 367-372 (6 pages)

Funding: Open Fund of the Yunnan Key Laboratory of Media Convergence (320225403).

Abstract: To address the low accuracy and poor robustness of language recognition for broadcast speech signals at low signal-to-noise ratio (SNR), a language recognition algorithm is proposed that combines an MFCC improved by the wavelet packet transform with energy operator cepstrum features. First, the wavelet packet transform replaces the Fourier transform and Mel filtering in MFCC, yielding the WMFCC feature parameters. This retains the auditory perception characteristics of the human ear while improving the high-frequency analysis capability and analysis precision for speech signals, overcoming the limitations of the Fourier transform. Second, the Teager energy operator cepstrum is extracted to capture the instantaneous energy characteristics of speech and is fused with the improved MFCC feature parameters to obtain a new feature parameter, TWMFCC. Finally, to further improve recognition of low-SNR speech, a VMD adaptive Wiener filtering denoising algorithm is proposed. Experiments compare the recognition performance of the proposed features with that of traditional features. The average recognition accuracy of the proposed features improves significantly: for noisy speech without any denoising, it is 13.02% higher than that of traditional MFCC. The proposed features effectively alleviate the low recognition accuracy of traditional features at low SNR and show strong noise resistance and robustness.
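The abstract names the Teager energy operator as the source of the instantaneous-energy feature but does not give its form. As a minimal sketch, assuming the standard discrete definition psi[n] = x[n]^2 - x[n-1] * x[n+1], the operator can be written as below. The function name `teager_energy` and the test tone are illustrative assumptions, and the cepstral step and the fusion into TWMFCC are not shown because the abstract does not specify them.

```python
import math

def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns len(x) - 2 values, since the two endpoints lack a neighbour
    on one side. This is the textbook definition, assumed here; the paper's
    exact variant is not stated in the abstract.
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

# Illustrative input (hypothetical values): a pure tone A * cos(omega * n)
A, omega = 2.0, 0.3
tone = [A * math.cos(omega * n) for n in range(64)]
psi = teager_energy(tone)

# For a pure tone the operator output is constant: A^2 * sin(omega)^2,
# so it tracks both amplitude and frequency of the signal.
expected = (A * math.sin(omega)) ** 2
print(max(abs(v - expected) for v in psi) < 1e-9)
```

Because the output depends on only three adjacent samples, the operator responds to energy changes within a frame, which is consistent with the abstract's description of it as an instantaneous-energy characteristic.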

Keywords: language recognition; MFCC; wavelet packet transform; energy operator cepstrum; GMM-UBM

Classification code: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]
