An Acoustic Wave Equation Emotion Recognition Model Based on Image Saliency  (Cited by: 1)


Authors: JIA Ning [1]; ZHENG Chunjun [1]

Affiliation: [1] School of Software, Dalian Neusoft University of Information, Dalian 116023, China

Source: Journal of Data Acquisition and Processing (《数据采集与处理》), 2021, No. 5, pp. 1062-1072 (11 pages)

Abstract: Speech emotion recognition (SER) is key to enabling computers to understand human emotion and is an important component of human-computer interaction. When emotional speech signals propagate through different media, deep learning models achieve low recognition accuracy and transfer poorly. To address this, an image saliency gated recurrent acoustic wave equation emotion recognition (ISGR-AWEER) model is designed, consisting of an image saliency extraction component and a gated recurrent acoustic wave model. The former simulates the attention mechanism and extracts the regions of speech that effectively express emotion. The latter is an acoustic wave emotion recognition model that mimics the processing flow of a recurrent neural network; it improves SER accuracy across media and allows rapid cross-medium model transfer. Experiments on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) emotional corpus and a self-built multi-medium emotional speech corpus verify the effectiveness of the model: compared with a traditional recurrent neural network, emotion recognition accuracy improves by 25%, and the model shows strong cross-medium transfer ability.

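Keywords: speech emotion recognition; image saliency gated recurrent acoustic wave equation emotion recognition (ISGR-AWEER); image saliency; acoustic wave equation; gated recurrent; multi-medium emotional speech corpus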

Classification number: TP183 [Automation and Computer Technology: Control Theory and Control Engineering]

 
