Authors: LI Fengyuan, LIN Suzhen, WANG Yanbo, LI Dawei, GU Mengyao
Affiliations: [1] School of Computer Science and Technology, North University of China, Taiyuan 030051, China; [2] School of Electrical and Control Engineering, North University of China, Taiyuan 030051, China
Source: Journal of North University of China (Natural Science Edition), 2025, No. 1, pp. 1-9.
Funding: National Natural Science Foundation of China (62271453); Natural Science Foundation of Shanxi Province (202303021211147); Shanxi Applied Basic Research Program (20210302123025); Shanxi Intellectual Property Office Patent Transformation Special Program (202302001).
Abstract: Existing research on multimodal sentiment analysis mainly fuses information from different modalities through the overall interaction of their features, without considering the unique features each modality contains or the relationship between unique and common features, so complex emotions cannot be analyzed effectively. To address this problem, a multimodal sentiment analysis model based on adaptive gated decoupled feature fusion (AGDF) is proposed. First, pre-trained BERT and Transformer models are used to extract features from each modality. Second, contrastive pairs are constructed on the principle that the common features of different modalities are similar while their unique features are not, and contrastive learning is used to decompose each modality's features into unique and common parts. Third, based on the observation that the visual and audio modalities act as shifts on the text modality, a new adaptive gating mechanism is designed for feature fusion, integrating information from the other modalities into the text modality. At the same time, a relation graph over the unique and common features is constructed and fused with a graph attention network to balance unique and common information across modalities. Finally, the fused features are classified. Experiments on the CMU-MOSI and CMU-MOSEI datasets show that the proposed method improves accuracy and F1 score by about 1 percentage point over the baseline methods. In addition, compared with other feature decomposition methods, the proposed method improves accuracy by 1.23 percentage points, F1 score by 1.37 percentage points, and Corr by 2.13 percentage points, and reduces MAE by 4.83 percentage points. These results indicate that the proposed method makes fuller use of the heterogeneous information in different modalities and effectively improves sentiment recognition.
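To make the pipeline in the abstract concrete, the sketch below illustrates two of its steps in PyTorch: decomposing each modality's features into common and unique parts with a contrastive objective, and fusing the audio/visual modalities into the text modality through an adaptive gate. The paper's actual AGDF implementation is not reproduced here; all module names, dimensions, and the specific loss form are assumptions for illustration.

```python
# Illustrative sketch only: the paper's AGDF code is not public, so every
# module name, dimension, and loss form below is a hypothetical
# reconstruction of the ideas described in the abstract.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Decoupler(nn.Module):
    """Project one modality's features into common and unique subspaces."""
    def __init__(self, dim: int):
        super().__init__()
        self.common = nn.Linear(dim, dim)  # across-modality (shared) subspace
        self.unique = nn.Linear(dim, dim)  # modality-specific subspace

    def forward(self, x):
        return self.common(x), self.unique(x)


def decoupling_contrastive_loss(c_a, c_b, u_a, u_b, temp=0.1):
    # Contrastive pairing per the abstract: common features of two modalities
    # form the positive pair (pulled together); unique features form the
    # negative pair (pushed apart). The exact loss form is an assumption.
    pos = F.cosine_similarity(c_a, c_b, dim=-1) / temp
    neg = F.cosine_similarity(u_a, u_b, dim=-1) / temp
    return -torch.log(torch.sigmoid(pos - neg)).mean()


class AdaptiveGate(nn.Module):
    """Fuse audio/visual features into text as a gated additive shift,
    reflecting the idea that nonverbal modalities shift the text modality."""
    def __init__(self, dim: int):
        super().__init__()
        self.gate = nn.Linear(3 * dim, dim)   # decides how much to shift
        self.shift = nn.Linear(2 * dim, dim)  # computes the shift itself

    def forward(self, t, a, v):
        g = torch.sigmoid(self.gate(torch.cat([t, a, v], dim=-1)))
        delta = self.shift(torch.cat([a, v], dim=-1))
        return t + g * delta


# Hypothetical usage: a batch of 8 utterances, 128-d features per modality.
B, D = 8, 128
t, a, v = torch.randn(B, D), torch.randn(B, D), torch.randn(B, D)
c_t, u_t = Decoupler(D)(t)
c_a, u_a = Decoupler(D)(a)
loss = decoupling_contrastive_loss(c_t, c_a, u_t, u_a)
fused = AdaptiveGate(D)(t, a, v)  # text features shifted by audio/visual
```

The graph-attention fusion over the unique/common relation graph and the final classifier described in the abstract are omitted; the sketch only shows the decoupling and gating ideas.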
Keywords: sentiment analysis; contrastive learning; graph neural networks; multimodal information fusion; adaptive gating
Classification: TP391.1 [Automation and Computer Technology - Computer Application Technology]