Language Identification Method for Multi-task Learning Based on Contrastive Predictive Coding Model  Cited by: 1

Authors: ZHAO Jianchuan; YANG Haoquan[3]; XU Yong; WU Lian; CUI Zhongwei

Affiliations: [1] School of Mathematics and Big Data, Guizhou Education University, Guiyang 550018, China; [2] Big Data Science and Intelligent Engineering Research Institute, Guizhou Education University, Guiyang 550018, China; [3] School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Shenzhen 518000, China

Source: Journal of Data Acquisition and Processing (《数据采集与处理》), 2022, No. 2, pp. 288-297 (10 pages)

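Funding: Basic Research Program of the Guizhou Provincial Department of Science and Technology (黔科合基础-ZK[2021]一般334); Basic Research Program of the Guizhou Provincial Department of Education (黔科合基础[2020]1Y258); Innovation Group Research Project of the Guizhou Provincial Department of Education (黔教合KY字[2021]022); Guizhou Provincial Key Discipline Project "Computer Science and Technology" (ZDXK[2018]007号); Guizhou Province 2018 Third Batch of Provincial Service Industry Development Guidance Funds (黔发改服务[2018]1181号).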

Abstract: The key to language identification is extracting useful features from speech segments. A time-delay neural network (TDNN) can extract feature vectors that carry rich contextual information, which effectively improves system performance. This paper proposes a multi-task learning network for language identification that combines ECAPA (Emphasized channel attention)-TDNN with a contrastive predictive coding (CPC) model. ECAPA-TDNN serves as the backbone network and extracts global features of the speech; an improved CPC model serves as the auxiliary network and performs contrastive predictive learning on the frame-level features extracted by ECAPA-TDNN. The two networks are trained together by optimizing a joint loss function. Experiments were conducted on the 10 languages of the AP17-OLR Oriental Language Recognition challenge dataset. The results show that the proposed network achieves clearly higher identification accuracy than the baseline network on the 1 s, 3 s, and full-length (All) test sets.
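The abstract does not give the form of the joint loss, so the following is only a minimal sketch under stated assumptions: the main task uses cross-entropy over language classes, the auxiliary task uses the InfoNCE loss commonly paired with CPC, and the two are combined as a weighted sum. The function names and the `aux_weight` parameter are hypothetical illustrations, not the authors' implementation, and the "improved" CPC variant described in the paper is not reproduced here.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over a list of floats."""
    m = max(scores)
    lse = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - lse for s in scores]


def cross_entropy(logits, label):
    """Main-task loss: negative log-probability of the true language."""
    return -log_softmax(logits)[label]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def info_nce(prediction, candidates, positive_index):
    """Auxiliary-task loss (assumed InfoNCE): the predicted future frame
    should score higher against the true future frame than against the
    negative candidates."""
    scores = [dot(prediction, c) for c in candidates]
    return -log_softmax(scores)[positive_index]


def joint_loss(logits, label, prediction, candidates, positive_index,
               aux_weight=0.1):
    """Assumed joint objective: classification loss plus a weighted
    contrastive loss. aux_weight is a hypothetical hyperparameter."""
    return (cross_entropy(logits, label)
            + aux_weight * info_nce(prediction, candidates, positive_index))


# Toy usage: 3 language classes, 2-dimensional frame features.
logits = [2.0, 0.5, -1.0]
prediction = [1.0, 0.0]
candidates = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]  # index 0 is the positive
loss = joint_loss(logits, 0, prediction, candidates, 0)
```

In this sketch, setting `aux_weight` to 0 reduces training to the plain classification objective, which makes the weight a convenient knob for comparing against a baseline without the auxiliary task.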

Keywords: language identification; contrastive predictive coding; multi-task learning; ECAPA-TDNN; joint loss

Classification number: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]

 
