A language identification method based on auditory feature fusion in noisy environments

Cited by: 1

Authors: HUANG Zhangheng; LONG Hua[1]; SHAO Yubin[1]; DU Qingzhi[1]; SU Shumeng; WANG Yankai (Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China)

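Affiliation: [1] Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, Yunnan, China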

Source: Modern Electronics Technique (《现代电子技术》), 2023, No. 5, pp. 47-54 (8 pages)

Funding: National Natural Science Foundation of China (61761025).

Abstract: The single auditory features CFCC (cochlear filter cepstral coefficients) and GFCC (Gammatone frequency cepstral coefficients) yield low identification rates at low signal-to-noise ratios (SNR). To address this, a language identification method based on auditory feature fusion in noisy environments is proposed. In the feature-extraction front end, endpoint detection is performed on the noisy speech signal, and the signal is then denoised by combining spectral subtraction with a Wiener filter. A band-pass filter, chosen according to the frequency range in which human hearing is concentrated, then removes high- and low-frequency noise, further reducing the influence of noise on feature extraction. GFCC is extracted and merged with CFCC to form a fused feature, whose dimensionality is reduced by principal component analysis (PCA). Finally, the processed fused feature is classified by the frequency-domain attention network FcaNet. Experiments comparing different features at different SNRs show that the fused feature achieves a markedly higher language identification rate than either single feature; at 0 dB SNR in particular, identification accuracy is 9.75% and 11.08% higher than with GFCC and CFCC alone, respectively, indicating strong robustness.

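Keywords: language identification; signal endpoint detection; noise filtering; band-pass filtering; feature extraction; feature recognition; dimensionality reduction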

Classification: TN912.34-34 [Electronics and Telecommunications: Communication and Information Systems]
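
The fusion and dimensionality-reduction steps described in the abstract can be illustrated with a minimal sketch. The abstract does not say how GFCC is merged with CFCC, so frame-wise concatenation is an assumption here. The function names `fuse_features` and `pca_reduce`, the 13-coefficient feature size, the 16 retained components, and the random stand-in data are also hypothetical. PCA is computed with NumPy's SVD, not with the authors' code.

```python
import numpy as np

def fuse_features(gfcc, cfcc):
    """Concatenate per-frame GFCC and CFCC vectors along the feature axis.

    Assumption: fusion is plain concatenation; the abstract does not specify it.
    """
    if gfcc.shape[0] != cfcc.shape[0]:
        raise ValueError("both feature matrices need the same number of frames")
    return np.hstack([gfcc, cfcc])

def pca_reduce(features, n_components):
    """Project frames onto the top principal components (PCA via SVD)."""
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Random stand-ins for real GFCC/CFCC matrices (frames x coefficients).
rng = np.random.default_rng(0)
gfcc = rng.standard_normal((200, 13))
cfcc = rng.standard_normal((200, 13))

fused = fuse_features(gfcc, cfcc)   # shape (200, 26)
reduced = pca_reduce(fused, 16)     # shape (200, 16)
```

In a full pipeline, the reduced matrix would be the input to the classifier. The number of retained components would normally be chosen from the explained variance instead of being fixed in advance.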
