Authors: Fan Xijian, Yang Xubing[1], Zhang Li[1], Ye Qiaolin[1], Ye Ning[1]
Affiliation: [1] College of Information Science and Technology, Nanjing Forestry University, Nanjing 210037, China
Source: Journal of Nanjing University (Natural Science), 2021, Issue 2, pp. 309-317 (9 pages)
Funding: National Natural Science Foundation of China (61902187); Natural Science Foundation of Liaoning Province (2020-KF-22-04); Nanjing Science and Technology Innovation Project for Overseas Students; Jiangsu Province Double-Innovation Talent Program.
Abstract: Speech signals and facial expressions are the two main channels through which people express emotion, and they are generally regarded as the two principal modalities of emotional expression: the auditory modality and the visual modality. Most current emotion recognition methods rely on single-modal information, but single-modal recognition suffers from incomplete information and is easily disturbed by noise. To address these problems, this paper proposes a bimodal emotion recognition method that fuses auditory and visual information. First, a convolutional neural network and a pretrained facial expression model are used to extract acoustic features from the speech signal and visual features from the visual signal, respectively. The two types of features are then fused and compressed so that the correlated information between the modalities is fully exploited. Finally, a long short-term memory recurrent neural network performs emotion recognition on the fused audio-visual bimodal features. The method effectively captures the intrinsic associations between the auditory and visual modalities and thereby improves recognition performance. The proposed method is validated on the RECOLA dataset, and the experimental results show that the bimodal model outperforms the corresponding single-modality image or audio models.
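The pipeline described in the abstract (CNN-based acoustic features, pretrained facial-expression features, fusion with compression, then an LSTM over the fused sequence) can be sketched as follows. This is a minimal PyTorch sketch based only on the abstract: the layer sizes, the 1D-CNN audio encoder, the 40-dim audio and 512-dim face feature dimensions, and the two-output (arousal/valence) head are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch of a bimodal (audio + facial) emotion recognition model,
# assuming frame-aligned audio features and pre-extracted facial-expression features.
import torch
import torch.nn as nn

class AudioCNN(nn.Module):
    """1D CNN that maps a per-frame audio feature sequence to learned acoustic features."""
    def __init__(self, in_dim=40, out_dim=128):          # e.g. 40 log-mel bands (assumed)
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_dim, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(64, out_dim, kernel_size=5, padding=2), nn.ReLU(),
        )
    def forward(self, x):                                 # x: (batch, time, in_dim)
        return self.net(x.transpose(1, 2)).transpose(1, 2)  # -> (batch, time, out_dim)

class BimodalEmotionModel(nn.Module):
    """Fuse audio and facial features, compress them, then run an LSTM over time."""
    def __init__(self, audio_dim=40, face_dim=512, fused_dim=128, hidden=64, out_dim=2):
        super().__init__()
        self.audio_enc = AudioCNN(audio_dim, 128)
        # Visual features are assumed to come from a frozen pretrained facial-expression
        # model, so they enter here as a ready-made (batch, time, face_dim) tensor.
        self.fuse = nn.Sequential(nn.Linear(128 + face_dim, fused_dim), nn.ReLU())
        self.lstm = nn.LSTM(fused_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, out_dim)            # e.g. arousal and valence
    def forward(self, audio, face):                       # both (batch, time, feat_dim)
        a = self.audio_enc(audio)
        z = self.fuse(torch.cat([a, face], dim=-1))       # fusion + compression
        h, _ = self.lstm(z)
        return self.head(h)                               # per-frame emotion prediction

# Shape check with random tensors standing in for aligned audio/visual sequences.
model = BimodalEmotionModel()
audio = torch.randn(4, 100, 40)     # 4 clips, 100 frames, 40-dim audio features
face = torch.randn(4, 100, 512)     # matching 512-dim facial-expression features
print(model(audio, face).shape)     # torch.Size([4, 100, 2])
```

With RECOLA-style continuously annotated clips, the per-frame outputs would typically be trained against arousal/valence traces; the random tensors above are used only to check shapes.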
Classification code: TP391.4 [Automation and Computer Technology - Computer Application Technology]