MCM-ICE: Multimodal Classification Model with Independent-encoding and Co-encoding


Authors: GUO Ruifeng[1,2], WEI Jingxuan, YU Bihui, SUN Linzhuang[1,2]

Affiliations: [1] Shenyang Institute of Computing Technology, Chinese Academy of Sciences, Shenyang 110168, China; [2] University of Chinese Academy of Sciences, Beijing 110049, China

Source: Journal of Chinese Computer Systems, 2024, No. 9, pp. 2080-2086 (7 pages)

Funding: Supported by the National Key R&D Program of China (2019YFB1405803).

Abstract: Multimodal data processing is an important research area: combining several kinds of information, such as text and images, can improve model performance. However, designing effective multimodal classification models remains challenging because of the heterogeneity between modalities and the difficulty of information fusion. This paper proposes a new multimodal classification model, MCM-ICE, which addresses the challenges of feature representation and feature fusion by jointly applying independent-encoding and co-encoding strategies. MCM-ICE was evaluated on two datasets, Fashion-Gen and Hateful Memes Challenge, and the results show that it outperforms existing state-of-the-art methods on both tasks. The paper also examines how the choice of vectors taken from the Transformer output layer of the co-encoding module affects the results; the best results are obtained by selecting the [CLS] vector together with the average-pooled vector of the outputs excluding [CLS]. Ablation studies and exploratory analysis support the effectiveness of MCM-ICE on multimodal classification tasks.
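The vector selection described in the abstract (the [CLS] vector plus the average-pooled remaining output vectors) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `cls_plus_mean_pool` is hypothetical, the [CLS] token is assumed to sit at position 0, padding is ignored, and joining the two vectors by concatenation is an assumption, since the abstract does not say how they are combined. Plain Python lists are used to keep the example self-contained; a real model would apply the same operations to tensors.

```python
from typing import List, Sequence


def cls_plus_mean_pool(token_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Combine the [CLS] vector with the mean of all other output vectors.

    token_vectors: output vectors of one sequence, one vector per token.
    Assumption: index 0 is the [CLS] token and there are no padding tokens.
    Assumption: the two vectors are combined by concatenation.
    """
    if len(token_vectors) < 2:
        raise ValueError("need the [CLS] vector and at least one other token")

    cls_vec = list(token_vectors[0])   # the [CLS] output vector
    rest = token_vectors[1:]           # every output vector except [CLS]
    dim = len(cls_vec)

    # Average pooling over the non-[CLS] tokens, one hidden dimension at a time.
    mean_vec = [sum(vec[i] for vec in rest) / len(rest) for i in range(dim)]

    # Concatenate: the result has twice the hidden size.
    return cls_vec + mean_vec


# Toy example: three tokens with hidden size 2.
outputs = [[1.0, 2.0], [3.0, 4.0], [5.0, 8.0]]
print(cls_plus_mean_pool(outputs))  # [1.0, 2.0, 4.0, 6.0]
```

In this sketch the pooled representation would then be passed to a classification head; that step is omitted because the abstract gives no details about it.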

Keywords: multimodal data processing; feature representation; feature fusion; co-encoding

Classification code: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]

 
