Multi-modal Emotion Recognition Based on Hypergraph


Authors: ZONG Lin-Lin, ZHOU Jia-Hui, XIE Qiu-Jie, ZHANG Xian-Chao [1], XU Bo [3]

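Affiliations: [1] Department of Software, Dalian University of Technology, Dalian, Liaoning 116000; [2] School of Computer Science and Technology, Fudan University, Shanghai 200433; [3] School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116000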

Source: Chinese Journal of Computers, 2023, No. 12, pp. 2520-2534 (15 pages)

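Funding: Supported by the National Natural Science Foundation of China (No. 62006034) and the Dalian Youth Science and Technology Star Project (2021RQ056).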

Abstract: With the rapid progress of artificial intelligence, machines need to recognize users' emotions in order to provide a better human-computer interaction experience, and emotion recognition has therefore become an active field of artificial intelligence. Traditional emotion recognition is mostly based on the text modality. Compared with single-modality approaches, multi-modal emotion recognition benefits from data complementarity and model robustness. In multi-modal emotion recognition, feature fusion between modalities determines recognition performance. Recently, graph-based inter-modality fusion has attracted considerable research attention; these methods mostly rely on graphs of binary relations between two modalities. When processing data of three or more modalities, such graphs can hardly establish feature fusion among all modalities without introducing redundant information, which limits the performance of multi-modal emotion recognition. A more effective method for modeling and fusing multi-modal emotion features is therefore needed.

To solve this problem, this paper proposes Multi-modal Emotion Recognition Based on Hypergraph (MORAH), a model that introduces a hypergraph to establish multivariate relations among multi-modal data in place of the multiple binary relations used by existing graph structures, achieving more thorough and efficient multi-modal feature fusion. Specifically, the model divides multi-modal feature fusion into two stages: a hyperedge construction stage and a hypergraph learning stage. In the hyperedge construction stage, the information of each time step in the sequence is aggregated through a capsule network and a graph is built for each single modality; graph convolution then performs a second aggregation, whose result serves as the basis for building the hypergraph in the next stage. Thanks to this graph capsule aggregation method, MORAH can handle both aligned and unaligned data, without manual alignment of the unaligned data. In the hypergraph learning stage, the model establishes associations between the nodes of different modalities of the same sample, as well as associations among all modalities of samples of the same class. During hypergraph convolution, hierarchical multi-level hyperedges are used to avoid over-smoothed node embeddings, and a simplified hypergraph convolution is used to fuse high-level features across modalities, so that node features are updated only when necessary.

Comprehensive experiments on two benchmark datasets show that the model makes full use of the multivariate relations among multi-modal data by means of the hypergraph. Compared with existing state-of-the-art methods, on the unaligned data of the CMU-MOSI dataset MORAH improves binary classification accuracy by 1.3% and F1 score by 1.1%. On the unaligned data of the CMU-MOSEI dataset, MORAH improves binary classification accuracy and F1 score by 0.2% each.
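The hypergraph learning stage described in the abstract can be pictured with a small sketch. The code below is not the MORAH implementation. It assumes three modalities (text, audio, video), builds one hyperedge per sample joining that sample's modality nodes and one hyperedge per class joining all modality nodes of that class's samples, and then applies one generic, parameter-free propagation step of the form X' = Dv^{-1} H De^{-1} H^T X, as commonly used in hypergraph neural networks. The names `MODALITIES`, `build_incidence`, and `hypergraph_propagate` are hypothetical, and the paper's hierarchical multi-level hyperedges and its simplified convolution are not reproduced here.

```python
# Illustrative sketch only: a generic hypergraph propagation step,
# not the MORAH model. All names here are hypothetical.

MODALITIES = ("text", "audio", "video")  # assumed modality set


def build_incidence(labels):
    """Build the incidence matrix H, where H[v][e] = 1 if node v is in hyperedge e.

    Nodes: one per (sample, modality) pair.
    Hyperedges: one per sample (joins its modality nodes), followed by
    one per class (joins all modality nodes of samples with that label).
    """
    n_samples = len(labels)
    classes = sorted(set(labels))
    nodes = [(s, m) for s in range(n_samples) for m in MODALITIES]
    n_edges = n_samples + len(classes)
    H = [[0.0] * n_edges for _ in nodes]
    for v, (s, _modality) in enumerate(nodes):
        H[v][s] = 1.0                                      # sample-level hyperedge
        H[v][n_samples + classes.index(labels[s])] = 1.0   # class-level hyperedge
    return H, nodes


def hypergraph_propagate(H, X):
    """One propagation step: average node features into each hyperedge,
    then average the hyperedge features back into each node."""
    n_nodes, n_edges, dim = len(H), len(H[0]), len(X[0])
    edge_feats = []
    for e in range(n_edges):
        members = [v for v in range(n_nodes) if H[v][e] > 0]
        edge_feats.append(
            [sum(X[v][d] for v in members) / len(members) for d in range(dim)]
        )
    out = []
    for v in range(n_nodes):
        edges = [e for e in range(n_edges) if H[v][e] > 0]
        out.append(
            [sum(edge_feats[e][d] for e in edges) / len(edges) for d in range(dim)]
        )
    return out


if __name__ == "__main__":
    labels = [1, 1, 0]  # toy binary sentiment labels for three samples
    H, nodes = build_incidence(labels)
    # Toy 2-dimensional features: (sample index, modality index)
    X = [[float(s), float(MODALITIES.index(m))] for s, m in nodes]
    for node, feat in zip(nodes, hypergraph_propagate(H, X)):
        print(node, feat)
```

With three modalities, a single sample-level hyperedge covers what a binary-relation graph would need three pairwise edges to express, which is the redundancy the abstract argues against. A trainable version would add a learned weight matrix and a nonlinearity after the propagation step.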

Keywords: emotion recognition; multi-modal learning; hypergraph learning; hyperedge expansion; capsule network

Classification number: TP311 [Automation and Computer Technology - Computer Software and Theory]

 
