Authors: Feng Yong [1]; Shen Jintao; Xu Hongyan [1]; Wang Rongbing [1]; Liu Tingting; Zhang Yonggang [2]
Affiliations: [1] College of Information, Liaoning University, Shenyang 110036, China; [2] Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130012, China
Source: 《数据分析与知识发现》 (Data Analysis and Knowledge Discovery), 2025, Issue 3, pp. 16-27 (12 pages)
Funding: Supported by the Key Laboratory Fund of the Ministry of Education (Grant No. 93K172018K01), the General Program of the Scientific Research Fund of the Department of Education of Liaoning Province (Grant No. LJKMZ20020447), and a project of the Department of Science and Technology of Liaoning Province (Grant No. 2023JH4/10700056).
Abstract: [Objective] Existing multimodal sentiment analysis studies fuse multimodal data insufficiently and rarely consider the effect of intermodal heterogeneity, which limits sentiment classification accuracy. To address this, a cross-fusion multimodal sentiment analysis model based on the Translate mechanism is proposed. [Methods] First, the Translate mechanism converts features among the text, image, and audio modalities. The converted features are then fused with the target-modality features (unimodal fusion), mitigating the impact of intermodal heterogeneity on model performance. Finally, cross fusion lets the features of the different modalities interact fully, producing multimodal features that thoroughly capture the unimodal information; these are fed into a classifier for sentiment classification. [Results] Comparative experiments against current mainstream sentiment analysis models were conducted on the public CMU-MOSI and CMU-MOSEI datasets. The proposed model improves accuracy and F1-score over the second-best model by 0.96 and 1.00 percentage points, respectively. [Limitations] The modalities of multimodal data contribute unequally to sentiment analysis, and the model does not specifically address scenarios where the image and audio modalities contribute more than the text modality. [Conclusions] The proposed model fully fuses intermodal information, avoids the influence of intermodal heterogeneity, and effectively improves overall model performance.
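The [Methods] section describes a three-step flow: pairwise translation of modality features, fusion of the translated features with each target modality (unimodal fusion), and cross fusion of the resulting representations before classification. Below is a minimal PyTorch sketch of that flow for intuition only; the feature dimensions, the two-layer translators, and the concatenation-based fusion layers are illustrative assumptions, not the paper's implementation.

# A minimal sketch of the translate-then-fuse pipeline described in the abstract.
# Module names, dimensions, and the simple linear "translators" are assumptions.
import torch
import torch.nn as nn

class TranslateCrossFusion(nn.Module):
    def __init__(self, d_text=768, d_image=512, d_audio=128, d=256, n_classes=2):
        super().__init__()
        dims = {"text": d_text, "image": d_image, "audio": d_audio}
        # Project each modality into a shared dimension d.
        self.proj = nn.ModuleDict({m: nn.Linear(k, d) for m, k in dims.items()})
        # One "translator" per ordered pair of modalities (source -> target).
        self.translate = nn.ModuleDict({
            f"{s}2{t}": nn.Sequential(nn.Linear(d, d), nn.ReLU(), nn.Linear(d, d))
            for s in dims for t in dims if s != t
        })
        # Unimodal fusion: target features plus the two translated-in features.
        self.unimodal_fuse = nn.ModuleDict({m: nn.Linear(3 * d, d) for m in dims})
        # Cross fusion over the three fused unimodal representations.
        self.cross_fuse = nn.Sequential(nn.Linear(3 * d, d), nn.ReLU())
        self.classifier = nn.Linear(d, n_classes)

    def forward(self, feats):  # feats: {"text": (B, d_text), "image": ..., "audio": ...}
        z = {m: self.proj[m](x) for m, x in feats.items()}
        fused = {}
        for t in z:  # for each target modality
            translated = [self.translate[f"{s}2{t}"](z[s]) for s in z if s != t]
            fused[t] = self.unimodal_fuse[t](torch.cat([z[t], *translated], dim=-1))
        joint = self.cross_fuse(
            torch.cat([fused["text"], fused["image"], fused["audio"]], dim=-1))
        return self.classifier(joint)

model = TranslateCrossFusion()
batch = {"text": torch.randn(4, 768), "image": torch.randn(4, 512), "audio": torch.randn(4, 128)}
logits = model(batch)  # shape (4, n_classes)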
Keywords: sentiment analysis; multimodal; modality translation; feature fusion; Translate
Classification: TP302 [Automation and Computer Technology—Computer System Architecture]; G202 [Automation and Computer Technology—Computer Science and Technology]