Speech Emotion Recognition Algorithm Based on Multi-Task Learning and Recurrent Neural Network

Cited by: 19

Authors: Feng Tianyi, Yang Zhen [1,2]

Affiliations: [1] Key Lab of Broadband Wireless Communication and Sensor Network Technology, Ministry of Education, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210003, China; [2] National Local Joint Engineering Research Center for Communications and Network Technology, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210003, China

Source: Journal of Signal Processing (《信号处理》), 2019, No. 7, pp. 1133-1140 (8 pages)

Funding: National High Technology Research and Development Program of China ("863" Program) (2006AA010102)

Abstract: With the rapid development of machine learning, many researchers use neural networks to tackle problems in speech recognition. However, for reasons such as limited training data, conventional neural network classifiers commonly suffer from problems such as high generalization error. To address this, multi-task learning, a branch of transfer learning, has been introduced. This paper proposes a speech emotion recognition algorithm based on multi-task learning and a recurrent neural network (MTL-RNN). It takes speaker emotion recognition as the main task and gender recognition and speaker identity recognition as auxiliary tasks, and the three tasks are trained in parallel in one network. The model shares network parameters and learns shared features through RNN shared layers, and learns task-specific features through attribute-dependent layers, in order to improve classification performance. Experimental results show that the proposed MTL-RNN algorithm achieves good recognition performance in both Chinese and Arabic, and in scenarios with both fewer and more speakers.
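The structure described in the abstract (a shared recurrent layer feeding three task-specific output layers, trained with a joint objective) can be illustrated with a minimal sketch. This is not the authors' implementation: the class `MTLRNNSketch`, the single Elman-style recurrent layer, the layer sizes, the class counts, and the loss weights are all assumptions chosen for illustration, and only the forward pass and joint loss are shown (no training loop).

```python
import math
import random


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(x - m) for x in z]
    s = sum(e)
    return [x / s for x in e]


def rand_matrix(rows, cols, rng, scale=0.1):
    """Small random weight matrix (hypothetical initialisation)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class MTLRNNSketch:
    """Shared recurrent layer plus one task-specific head per task (assumed layout)."""

    def __init__(self, n_features, n_hidden, n_classes, seed=0):
        rng = random.Random(seed)
        self.n_hidden = n_hidden
        # Shared parameters: used by every task.
        self.W_x = rand_matrix(n_hidden, n_features, rng)
        self.W_h = rand_matrix(n_hidden, n_hidden, rng)
        self.b = [0.0] * n_hidden
        # Task-specific ("attribute-dependent") output layers.
        self.heads = {t: rand_matrix(k, n_hidden, rng) for t, k in n_classes.items()}

    def shared_encode(self, frames):
        """Run the shared RNN over all frames and return the final hidden state."""
        h = [0.0] * self.n_hidden
        for x in frames:
            a = matvec(self.W_x, x)
            r = matvec(self.W_h, h)
            h = [math.tanh(ai + ri + bi) for ai, ri, bi in zip(a, r, self.b)]
        return h

    def forward(self, frames):
        """Return class probabilities for each task from the shared encoding."""
        h = self.shared_encode(frames)
        return {t: softmax(matvec(W, h)) for t, W in self.heads.items()}


def joint_loss(probs, labels, weights):
    """Weighted sum of per-task cross-entropy losses (weights are assumptions)."""
    return sum(weights[t] * -math.log(probs[t][labels[t]]) for t in probs)


# Usage: one utterance of 10 frames with 4 features each (random stand-in data).
model = MTLRNNSketch(4, 8, {"emotion": 6, "gender": 2, "speaker": 5})
data_rng = random.Random(1)
frames = [[data_rng.uniform(-1, 1) for _ in range(4)] for _ in range(10)]
probs = model.forward(frames)
loss = joint_loss(
    probs,
    {"emotion": 2, "gender": 0, "speaker": 3},
    {"emotion": 1.0, "gender": 0.3, "speaker": 0.3},
)
```

In this sketch the main task (emotion) is given a larger loss weight than the two auxiliary tasks. That weighting is one common design choice for multi-task training, not a value taken from the paper.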

Keywords: speech emotion recognition; multi-task learning; recurrent neural network

Classification code: TN912.34 [Electronics and Telecommunications: Communication and Information Systems]

 
