KCF: Knowledge-Enhanced Cross-Modal Fusion Network for Multimodal Emotion Recognition in Conversation


Authors: Gan Xinyi [1]; Huang Xianying [1]; Zou Shihao [2]; Shen Xudong [1]

Affiliations: [1] College of Computer Science & Engineering, Chongqing University of Technology, Chongqing 400054, China; [2] School of Computer Science & Technology, Huazhong University of Science & Technology, Wuhan 430074, China

Source: Application Research of Computers (《计算机应用研究》), 2025, No. 4, pp. 1065-1072 (8 pages)

Funding: National Natural Science Foundation of China (62141201); Natural Science Foundation of Chongqing (CSTB2022NSCQ-MSX1672); Chongqing Graduate Research Innovation Project (CYS23675).

Abstract: To address the underutilization of differences in modal representation capabilities and of speaker emotional cues, this paper proposes a knowledge-enhanced cross-modal fusion network model. The model incorporates a cross-modal module enhanced by external knowledge, which systematically integrates weaker modal features with multi-level text and external knowledge, embedding them into the multi-head attention layer. This approach fully extracts valuable information from the weaker modalities, ensuring feature complementarity and consistency across modalities. Additionally, the model introduces an emotion-cue enhancement module based on a directed graph, which leverages external knowledge linked to the speaker's emotional cues to strengthen the fused features. This module also constructs a directed graph over contextual information, allowing for a deeper exploration and utilization of the speaker's emotional states. Experimental results on two benchmark datasets demonstrate that the model effectively harnesses both modal representation differences and speaker emotional cues, achieving significantly improved emotion recognition performance compared to existing methods, thereby validating the model's feasibility and effectiveness.
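As a minimal illustrative sketch only (not the paper's implementation; all function names, dimensions, and random projections here are assumptions), the cross-modal fusion idea described in the abstract can be pictured as multi-head cross-attention in which a weaker modality, such as audio, queries text features:

```python
import numpy as np

def cross_modal_attention(query_feats, text_feats, num_heads=4):
    """Illustrative multi-head cross-attention: a weaker modality
    (queries) attends over text features (keys/values).
    Both inputs have shape (seq_len, d_model)."""
    seq_q, d_model = query_feats.shape
    seq_k = text_feats.shape[0]
    assert d_model % num_heads == 0
    d_head = d_model // num_heads
    rng = np.random.default_rng(0)
    # Random projections stand in for learned weight matrices.
    Wq, Wk, Wv = (rng.standard_normal((d_model, d_model)) / np.sqrt(d_model)
                  for _ in range(3))
    Q = (query_feats @ Wq).reshape(seq_q, num_heads, d_head)
    K = (text_feats @ Wk).reshape(seq_k, num_heads, d_head)
    V = (text_feats @ Wv).reshape(seq_k, num_heads, d_head)
    out = np.empty_like(Q)
    for h in range(num_heads):
        scores = Q[:, h] @ K[:, h].T / np.sqrt(d_head)
        # Numerically stable softmax over the key dimension.
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        out[:, h] = weights @ V[:, h]
    # Residual connection preserves the weaker modality's own signal.
    return out.reshape(seq_q, d_model) + query_feats
```

In the paper's framing, the text and external-knowledge features would be fused layer by layer into such attention blocks; this sketch shows only the basic cross-attention mechanism that underlies that design.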

Keywords: emotion recognition in conversation; external knowledge; data augmentation; Transformer; multimodal interaction

CLC Classification: TP391 [Automation & Computer Technology — Computer Application Technology]
