Multi-label Image Emotion Prediction Based on Multimodal Large Language Model and Chain of Thought

Authors: WANG Bingbing; GUO Xingwen; LIANG Bin [3]; XU Ruifeng

Affiliations: [1] Harbin Institute of Technology (Shenzhen), Shenzhen, Guangdong 518055, China; [2] No. 30 Institute of CETC, Chengdu, Sichuan 610041, China; [3] The Chinese University of Hong Kong, Hong Kong 999077, China

Source: Information Security and Communications Privacy (《信息安全与通信保密》), 2025, No. 3, pp. 2-11 (10 pages)

Funding: National Natural Science Foundation of China, General Program (62176076).

Abstract: Image emotion prediction aims to predict the emotions that an image will evoke in its viewers. Research in this area is important for social media public-opinion analysis and the comprehensive governance of cyberspace. Existing emotion classification methods often overlook the fact that a single image may evoke several different emotions. To address this problem, this paper proposes MIEP-CoT (Multi-label Image Emotion Prediction based on Chain of Thought), a multi-label image emotion prediction model built on a multimodal large language model and a chain-of-thought framework. The chain of thought proceeds through three inference steps (image description generation, emotion text derivation, and image-text matching), improving both the accuracy and the interpretability of multi-label image emotion prediction. Experimental results on two public datasets, Emotion6 and EMOTIC, show that the proposed method achieves excellent performance.
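The three inference steps named in the abstract can be pictured as a small pipeline. The sketch below is only an illustration of that control flow, not the authors' implementation: the abstract does not specify the model, the prompts, or the matching function, so `predict_emotions`, the three callables, the label list, the threshold, and the toy stand-ins are all hypothetical. In practice the callables would wrap a multimodal large language model and an image-text similarity scorer.

```python
from typing import Callable, Dict, List


def predict_emotions(
    image: dict,
    describe: Callable[[dict], str],
    derive: Callable[[str, str], str],
    match: Callable[[dict, str], float],
    labels: List[str],
    threshold: float = 0.5,
) -> Dict[str, object]:
    """Hypothetical three-step chain; every argument is an assumption."""
    # Step 1: image description generation.
    description = describe(image)
    # Step 2: emotion text derivation, one candidate text per label.
    emotion_texts = {label: derive(description, label) for label in labels}
    # Step 3: image-text matching, scoring each derived text against the image.
    scores = {label: match(image, text) for label, text in emotion_texts.items()}
    # Multi-label decision: keep every label whose score clears the threshold.
    predicted = [label for label in labels if scores[label] >= threshold]
    return {
        "description": description,
        "emotion_texts": emotion_texts,
        "scores": scores,
        "labels": predicted,
    }


# Toy stand-ins so the sketch runs without any model (all assumptions).
def toy_describe(image: dict) -> str:
    return image["caption"]


def toy_derive(description: str, label: str) -> str:
    return f"A scene that evokes {label}: {description}"


def toy_match(image: dict, text: str) -> float:
    # Pretend similarity: 1.0 if the text mentions any tag attached to the image.
    return 1.0 if any(tag in text for tag in image["tags"]) else 0.0


# Illustrative label set and input, not taken from the paper.
LABELS = ["anger", "disgust", "fear", "joy", "sadness", "surprise"]
demo_image = {
    "caption": "a child crying next to a broken toy",
    "tags": ["sadness", "anger"],
}
```

Calling `predict_emotions(demo_image, toy_describe, toy_derive, toy_match, LABELS)` returns the intermediate description and derived texts alongside the predicted labels. Keeping those intermediate outputs is one way the interpretability mentioned in the abstract could be surfaced.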

Keywords: image emotion prediction; chain of thought; multimodal large language model; multi-label learning

Classification: TP3-0 [Automation and Computer Technology: Computer Science and Technology]
