Authors: 刘浠辰 (LIU Xi-chen); 姜囡 (JIANG Nan); 杜扶遥 (DU Fu-yao) (College of Public Security Information Technology and Intelligence, Criminal Investigation Police University of China, Shenyang, Liaoning 110854, China; Key Laboratory of Evidence Science, Ministry of Education, China University of Political Science and Law, Beijing 100088, China)
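Affiliations: [1] College of Public Security Information Technology and Intelligence, Criminal Investigation Police University of China, Shenyang, Liaoning 110854 [2] Key Laboratory of Evidence Science, Ministry of Education, China University of Political Science and Law, Beijing 100088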
Source: Computer Simulation (《计算机仿真》), 2025, No. 2, pp. 215-220 (6 pages)
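Funding: Public Security Discipline Basic Theory Research Innovation Program (Research on the Basic Theory and Discipline System of Security Technology and Engineering, 2022XKGJ0110); Joint Open Fund of the Department of Science and Technology of Liaoning Province and the State Key Laboratory of Robotics Open Fund (2020-KF-12-11); Open Fund of the Key Laboratory of Evidence Science, Ministry of Education (China University of Political Science and Law) (2021KFKT09); Fundamental Research Funds for the Central Universities (3242019010); Natural Science Foundation of Liaoning Province (2019-ZD-0168); Key Research Project of the Ministry of Education (E-AQGABQ20202710).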
Abstract: To address problems such as missing features in speech-only (single-modal) emotion recognition, a bimodal emotion recognition method based on the fusion of speech and video dynamic features is proposed. The method also addresses the loss of temporal features that occurs when emotion recognition relies on static image features. Because human body movements in video can fully reflect emotional characteristics, deep features of human actions are extracted as the video dynamic features. The number of MFCC coefficients is varied to analyze how the number of speech features affects emotion recognition. Mixed MFCC and fundamental-frequency (pitch) features are fed into a bidirectional LSTM network to obtain deep speech features. On the IEMOCAP dataset, emotion recognition with each of the two single-modal features is compared against the proposed bimodal method. The results show that the proposed bimodal dynamic-feature method improves the recognition rate by 9.6% and 21.1%, respectively, and that the recognition rates improve significantly when the number of MFCC coefficients is set to 40.
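The data flow described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the record does not specify the fusion rule, so concatenation is an assumption, and the function names `mix_mfcc_and_pitch` and `fuse_features`, the toy values, and the use of 0.0 for unvoiced frames are all hypothetical. The bidirectional LSTM and the action-feature extractor are represented only by placeholder vectors.

```python
from typing import List

Frame = List[float]


def mix_mfcc_and_pitch(mfcc_frames: List[Frame], f0_per_frame: List[float]) -> List[Frame]:
    """Append each frame's pitch (F0) value to its MFCC vector."""
    if len(mfcc_frames) != len(f0_per_frame):
        raise ValueError("MFCC and F0 sequences must have the same number of frames")
    return [mfcc + [f0] for mfcc, f0 in zip(mfcc_frames, f0_per_frame)]


def fuse_features(speech_vec: Frame, video_vec: Frame) -> Frame:
    """Feature-level fusion by concatenation (an assumed fusion rule)."""
    return list(speech_vec) + list(video_vec)


# Toy input: 3 frames of 40 MFCC coefficients (the count the abstract reports
# as best) plus one F0 value per frame.
n_mfcc = 40
mfcc_frames = [[0.0] * n_mfcc for _ in range(3)]
f0_per_frame = [110.0, 112.5, 0.0]  # 0.0 marks an unvoiced frame (assumed convention)

# Each mixed frame holds n_mfcc + 1 values; this sequence would be the
# per-time-step input to a bidirectional LSTM.
mixed = mix_mfcc_and_pitch(mfcc_frames, f0_per_frame)

# Placeholders standing in for the learned deep features of each modality.
speech_deep = [0.1, 0.2, 0.3, 0.4]  # hypothetical BiLSTM output
video_deep = [0.5, 0.6]             # hypothetical human-action features

# The fused vector would then be passed to an emotion classifier.
fused = fuse_features(speech_deep, video_deep)
```

Changing `n_mfcc` changes only the width of each mixed frame, which is the knob the abstract describes varying when it studies the effect of the number of MFCC coefficients.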
Keywords: bimodal; video dynamic features; speech features; feature fusion; emotion recognition
CLC number: TP391.9 [Automation and Computer Technology: Computer Application Technology]