Authors: Fan Xijian, Yang Xubing[1], Zhang Li[1], Ye Qiaolin[1], Ye Ning[1]
Affiliation: [1] College of Information Science and Technology, Nanjing Forestry University, Nanjing 210037, China
Source: Journal of Nanjing University (Natural Science), 2021, Issue 2, pp. 309-317 (9 pages)
Funding: National Natural Science Foundation of China (61902187); Natural Science Foundation of Liaoning Province (2020-KF-22-04); Nanjing Science and Technology Innovation Project for Overseas Students; Jiangsu Province Double-Innovation Talent Program.
Abstract: Speech signals and facial expressions are the two main channels through which people express emotion, and they are generally regarded as the two principal modalities of emotional expression: the auditory modality and the visual modality. Most current emotion recognition methods rely on single-modal information, but single-modal recognition suffers from incomplete information and is easily disturbed by noise. To address these problems, this paper proposes a bimodal emotion recognition method that fuses auditory and visual information. First, a convolutional neural network and a pretrained facial expression model are used to extract acoustic features from the speech signal and visual features from the visual signal, respectively. The two types of features are then fused and compressed so that the correlated information between the modalities is fully exploited. Finally, a long short-term memory recurrent neural network performs emotion recognition on the fused audio-visual bimodal features. The method effectively captures the intrinsic associations between the auditory and visual modalities and thereby improves recognition performance. The proposed method is validated on the RECOLA dataset, and the experimental results show that the bimodal model outperforms the corresponding single-modality image or audio models.
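The pipeline described in the abstract (CNN-based acoustic features, pretrained facial-expression features, fusion with compression, then an LSTM over the fused sequence) can be sketched as follows. This is a minimal PyTorch sketch based only on the abstract: the layer sizes, the 1D-CNN audio encoder, the 40-dim audio and 512-dim face feature dimensions, and the two-output (arousal/valence) head are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch of a bimodal (audio + facial) emotion recognition model,
# assuming frame-aligned audio features and pre-extracted facial-expression features.
import torch
import torch.nn as nn

class AudioCNN(nn.Module):
    """1D CNN that maps a per-frame audio feature sequence to learned acoustic features."""
    def __init__(self, in_dim=40, out_dim=128):          # e.g. 40 log-mel bands (assumed)
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(in_dim, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(64, out_dim, kernel_size=5, padding=2), nn.ReLU(),
        )
    def forward(self, x):                                 # x: (batch, time, in_dim)
        return self.net(x.transpose(1, 2)).transpose(1, 2)  # -> (batch, time, out_dim)

class BimodalEmotionModel(nn.Module):
    """Fuse audio and facial features, compress them, then run an LSTM over time."""
    def __init__(self, audio_dim=40, face_dim=512, fused_dim=128, hidden=64, out_dim=2):
        super().__init__()
        self.audio_enc = AudioCNN(audio_dim, 128)
        # Visual features are assumed to come from a frozen pretrained facial-expression
        # model, so they enter here as a ready-made (batch, time, face_dim) tensor.
        self.fuse = nn.Sequential(nn.Linear(128 + face_dim, fused_dim), nn.ReLU())
        self.lstm = nn.LSTM(fused_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, out_dim)            # e.g. arousal and valence
    def forward(self, audio, face):                       # both (batch, time, feat_dim)
        a = self.audio_enc(audio)
        z = self.fuse(torch.cat([a, face], dim=-1))       # fusion + compression
        h, _ = self.lstm(z)
        return self.head(h)                               # per-frame emotion prediction

# Shape check with random tensors standing in for aligned audio/visual sequences.
model = BimodalEmotionModel()
audio = torch.randn(4, 100, 40)     # 4 clips, 100 frames, 40-dim audio features
face = torch.randn(4, 100, 512)     # matching 512-dim facial-expression features
print(model(audio, face).shape)     # torch.Size([4, 100, 2])
```

With RECOLA-style continuously annotated clips, the per-frame outputs would typically be trained against arousal/valence traces; the random tensors above are used only to check shapes.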
Classification code: TP391.4 [Automation and Computer Technology - Computer Application Technology]