Language identification based on joint decision of nonlinear spectrograms


Authors: DUAN Yun; SHAO Yubin[1]; LONG Hua[1]; DU Qingzhi[1] (Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650504, China)

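Affiliation: [1] Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650504, Yunnan, China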

Source: Microelectronics & Computer, 2024, No. 5, pp. 99-108 (10 pages)

Funding: National Natural Science Foundation of China (61761025).

Abstract: Gray-scale logarithmic spectrograms stretch the fundamental-frequency region excessively, which limits gains in recognition rate for short-duration speech. To address this, a language identification method based on joint decision of nonlinear spectrograms is proposed. First, the speech is energy-normalized and its logarithmic power spectrum is extracted; the frequency scale is then mapped nonlinearly according to human auditory perception to obtain the nonlinear spectrogram. Next, the nonlinear spectrogram is split at equal intervals according to word-association characteristics, and a joint decision layer is added at the back end of a ResNet network, which outputs the language of the speech. Experimental results show that the proposed method effectively mitigates the shortcomings of the gray-scale logarithmic spectrogram and that its recognition performance exceeds that of the spectrogram and its improved features. Joint decision performs best on sample speech with a segment duration of 1.0 s, reaching a recognition rate of 94.25% on the broadcast audio dataset and 98.94% on the VoxForge public corpus.
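The pipeline summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: the mel scale stands in for the unspecified perceptual frequency mapping, the frame parameters (`n_fft`, `hop`, `n_bins`) are arbitrary, the ResNet classifier is omitted, and the joint decision is approximated by averaging per-segment class posteriors. All function names are hypothetical.

```python
import numpy as np

def energy_normalize(x):
    """Scale the waveform to unit average power."""
    x = np.asarray(x, dtype=np.float64)
    power = np.mean(x ** 2)
    return x / np.sqrt(power) if power > 0 else x

def log_power_spectrogram(x, n_fft=512, hop=160):
    """Hann-windowed STFT -> log power, shape (n_fft // 2 + 1, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return np.log(power + 1e-10).T

def warp_frequency_axis(spec, sr=16000, n_bins=128):
    """Resample the linear-frequency rows onto a mel-spaced grid.

    Assumption: the mel scale is used here as a stand-in for the
    perceptual mapping, which the abstract does not specify.
    """
    lin_hz = np.linspace(0.0, sr / 2, spec.shape[0])
    mel_max = 2595.0 * np.log10(1.0 + (sr / 2) / 700.0)
    mel_grid = np.linspace(0.0, mel_max, n_bins)
    target_hz = 700.0 * (10.0 ** (mel_grid / 2595.0) - 1.0)
    # Interpolate each frame (column) along the frequency axis.
    return np.stack([np.interp(target_hz, lin_hz, spec[:, t])
                     for t in range(spec.shape[1])], axis=1)

def split_equal(spec, frames_per_segment):
    """Split along time into equal-length segments, dropping the remainder."""
    n_seg = spec.shape[1] // frames_per_segment
    return [spec[:, i * frames_per_segment:(i + 1) * frames_per_segment]
            for i in range(n_seg)]

def joint_decision(segment_probs):
    """Average per-segment class posteriors (assumed rule) and pick the arg-max."""
    mean_probs = np.mean(np.asarray(segment_probs, dtype=np.float64), axis=0)
    return int(np.argmax(mean_probs)), mean_probs

# Usage on synthetic noise: 3 s at 16 kHz, 100 frames * 10 ms hop = 1.0 s segments.
rng = np.random.default_rng(0)
sr = 16000
x = energy_normalize(rng.standard_normal(3 * sr))
spec = warp_frequency_axis(log_power_spectrogram(x), sr=sr)
segments = split_equal(spec, frames_per_segment=100)
# A trained classifier would score each segment; made-up posteriors stand in here.
label, probs = joint_decision([[0.6, 0.4], [0.2, 0.8], [0.3, 0.7]])
```

In a full system each segment would be passed through the network, and the per-segment outputs would be combined by the joint decision layer; averaging is only one plausible combination rule.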

Keywords: language identification; spectrogram; nonlinear; joint decision; neural network

Classification number: TN912.3 [Electronics and Telecommunications: Communication and Information Systems]

 
