Learning to compose diversified prompts for image emotion classification  

Authors: Sinuo Deng, Lifang Wu, Ge Shi, Lehao Xing, Meng Jian, Ye Xiang, Ruihai Dong

Affiliations: [1] Faculty of Information Technology, Beijing University of Technology, Beijing 100124, China; [2] Insight Centre for Data Analytics, University College Dublin, Belfield, Dublin D04 V1W8, Ireland

Source: Computational Visual Media (English edition), 2024, No. 6, pp. 1169-1183 (15 pages)

Funding: Supported in part by the National Natural Science Foundation of China under Grant Nos. 62106010, 61976010, 62176011, and 62236010.

Abstract: Image emotion classification (IEC) aims to extract the abstract emotions evoked in images. Recently, language-supervised methods such as contrastive language-image pretraining (CLIP) have demonstrated superior performance in image understanding. However, the underexplored task of IEC presents three major challenges: a tremendous training objective gap between pretraining and IEC, shared suboptimal prompts, and invariant prompts for all instances. In this study, we propose a general framework that effectively exploits the language-supervised CLIP method for the IEC task. First, a prompt-tuning method that mimics the pretraining objective of CLIP is introduced to exploit the rich image and text semantics associated with CLIP. Subsequently, instance-specific prompts are automatically composed by conditioning them on the categories and image content of instances, which diversifies the prompts and thus avoids suboptimal problems. Evaluations on six widely used affective datasets show that the proposed method significantly outperforms state-of-the-art methods (up to 9.29% accuracy gain on the EmotionROI dataset) on IEC tasks with only a few trained parameters. The code is publicly available at https://github.com/dsn0w/PT-DPC/ for research purposes.
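
To make the idea in the abstract concrete, the following is a minimal toy sketch of CLIP-style similarity scoring with prompts that depend on the image. It is not the authors' implementation (see their repository for that). The function names, the additive image-conditioned shift in `compose_prompts`, the `weight` and `temperature` values, and the 2-dimensional vectors are all assumptions chosen for illustration only.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (assumes a non-zero vector)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def cosine_logits(image_vec, prompt_vecs, temperature=0.07):
    """Cosine similarity between one image vector and each prompt vector,
    divided by a temperature (0.07 is an assumed value)."""
    img = l2_normalize(image_vec)
    return [
        sum(a * b for a, b in zip(img, l2_normalize(p))) / temperature
        for p in prompt_vecs
    ]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def compose_prompts(class_vecs, image_vec, weight=0.1):
    """Hypothetical instance-specific prompts: shift each class vector
    by a scaled copy of the image vector. This is an illustrative
    stand-in, not the composition rule used in the paper."""
    return [
        [c + weight * x for c, x in zip(class_vec, image_vec)]
        for class_vec in class_vecs
    ]

def classify(image_vec, class_vecs, labels, weight=0.1):
    """Return the most probable label and the full probability list."""
    prompts = compose_prompts(class_vecs, image_vec, weight)
    probs = softmax(cosine_logits(image_vec, prompts))
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs

# Toy usage with made-up 2-D embeddings.
labels = ["amusement", "sadness"]
class_vecs = [[1.0, 0.0], [0.0, 1.0]]
label, probs = classify([0.9, 0.1], class_vecs, labels)
print(label, probs)
```

In a real system the vectors would come from CLIP's image and text encoders, and the prompt composition would be learned rather than fixed; the sketch only shows how instance-dependent prompts feed into a contrastive-style similarity score.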

Keywords: image emotion analysis; multimodal learning; pretraining model; prompt tuning

Classification code: TP3 [Automation and Computer Technology: Computer Science and Technology]

 
