A language identification method based on normalization of pitch frequency

Authors: DUAN Yun; SHAO Yubin[1]; LIU Jing; LONG Hua[1]; DU Qingzhi[1] (Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China)

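Affiliation: [1] Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500, China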

Source: Microelectronics & Computer, 2023, No. 5, pp. 20-28 (9 pages)

Funding: National Natural Science Foundation of China (61761025).

Abstract: Speaker-specific pronunciation characteristics interfere with language identification and degrade recognition performance. To address this, a language identification method based on normalizing the pitch (fundamental) frequency of speech is proposed. First, endpoint detection separates the signal into speech and non-speech segments, and the pitch frequency is extracted from the speech segments and normalized to generate glottal pulses. Next, the vocal tract response is extracted, and the glottal pulses and vocal tract response are passed through an all-pole filter to reconstruct pitch-normalized speech. Finally, low-level acoustic features are extracted from the reconstructed speech and fed to a ResNet back end for language identification. Experimental results show that the proposed method reduces the influence of speaker pronunciation characteristics on language-discriminative features. The effect is most pronounced for grayscale spectrograms, where the recognition rate reaches 94.3%. For traditional low-level acoustic features such as MFCC and GFCC, as well as the improved time-domain GF feature, the recognition rate improves by 3% to 4% in each case. Overall, the method effectively reduces the influence of speaker pronunciation characteristics and improves language identification performance.
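The reconstruction step described in the abstract can be illustrated with a minimal source-filter sketch. It is not the authors' implementation: the abstract does not say how the vocal tract response is obtained, so this sketch assumes linear prediction (autocorrelation method with the Levinson-Durbin recursion), models the glottal pulses as a plain unit impulse train, and processes a single frame that endpoint detection has already marked as speech. The function names, the target pitch, and the LPC order are all hypothetical choices made for illustration.

```python
import math

def impulse_train(n_samples, f0_hz, fs_hz):
    """Unit pulses one pitch period apart (a crude stand-in for glottal pulses)."""
    period = max(1, round(fs_hz / f0_hz))
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]

def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[t] * frame[t - lag] for t in range(lag, n))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Coefficients of A(z) = 1 + a1 z^-1 + ... + ap z^-p and the residual energy."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or degenerate frame: stop early
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

def all_pole_filter(excitation, a, gain=1.0):
    """All-pole synthesis: y[n] = gain * x[n] - sum_{k>=1} a[k] * y[n-k]."""
    y = []
    for n, x in enumerate(excitation):
        acc = gain * x
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]
        y.append(acc)
    return y

def resynthesize_frame(frame, f0_norm_hz, fs_hz, order=10):
    """Rebuild one speech frame with its pitch replaced by f0_norm_hz (assumed target)."""
    r = autocorrelation(frame, order)
    a, err = levinson_durbin(r, order)           # assumed vocal tract estimate (LPC)
    gain = math.sqrt(err) if err > 0.0 else 0.0  # match the residual energy scale
    excitation = impulse_train(len(frame), f0_norm_hz, fs_hz)
    return all_pole_filter(excitation, a, gain)
```

Calling `resynthesize_frame(frame, f0_norm_hz=200, fs_hz=8000)` on each speech frame would give every speaker the same excitation pitch while keeping the frame's estimated spectral envelope. A fuller pipeline would also need endpoint detection, per-frame pitch estimation, and overlap-add across frames, none of which are shown here.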

Keywords: language identification; normalization; speech reconstruction; pitch frequency; neural network

Classification code: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]

 
