Conditional selection with CNN augmented transformer for multimodal affective analysis  


Authors: Jianwen Wang, Shiping Wang, Shunxin Xiao, Renjie Lin, Mianxiong Dong, Wenzhong Guo

Affiliations: [1] College of Computer and Data Science, Fuzhou University, Fuzhou, China; [2] College of Computer and Cyber Security, Fujian Normal University, Fuzhou, China; [3] Key Laboratory of Network Computing and Intelligent Information Processing, Fuzhou University, Fuzhou, China; [4] Digital Fujian Institute of Big Data Security Technology, Fujian Normal University, Fuzhou, China; [5] Department of Sciences and Informatics, Muroran Institute of Technology, Muroran, Japan

Source: CAAI Transactions on Intelligence Technology (智能技术学报(英文)), 2024, No. 4, pp. 917-931 (15 pages)

Funding: National Key Research and Development Plan of China, Grant/Award Number: 2021YFB3600503; National Natural Science Foundation of China, Grant/Award Numbers: 62276065, U21A20472.

Abstract: The attention mechanism has been a successful method for multimodal affective analysis in recent years. Despite the advances, several significant challenges remain in fusing language with its nonverbal context information. One is to generate sparse attention coefficients associated with the acoustic and visual modalities, which helps locate critical emotional semantics. The other is to fuse complementary cross-modal representations to construct optimal salient feature combinations of multiple modalities. A Conditional Transformer Fusion Network is proposed to handle these problems. Firstly, the authors equip the transformer module with CNN layers to enhance the detection of subtle signal patterns in nonverbal sequences. Secondly, sentiment words are utilised as context conditions to guide the computation of cross-modal attention. As a result, the located nonverbal features are not only salient but also directly complementary to the sentiment words. Experimental results show that the authors' method achieves state-of-the-art performance on several multimodal affective analysis datasets.

Keywords: affective computing; data fusion; information fusion; multimodal approaches

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]
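
Illustrative sketch (not from the paper): the abstract names two ideas, CNN layers that sharpen local patterns in nonverbal sequences and sentiment words that condition cross-modal attention. The NumPy code below is one possible reading of those two ideas under loud assumptions. The function names, the residual ReLU convolution, the boolean `sentiment_mask`, and the plain softmax attention are all hypothetical choices made for illustration. The authors' actual Conditional Transformer Fusion Network, including how it obtains sparse attention coefficients, is not specified in this record.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def conv1d_same(x, kernel):
    """Zero-padded temporal convolution.

    x: (T, d) sequence, kernel: (k, d, d) with odd k. Returns (T, d).
    """
    k = kernel.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    return np.stack([
        sum(xp[t + j] @ kernel[j] for j in range(k))
        for t in range(x.shape[0])
    ])

def conditional_cross_attention(text, nonverbal, sentiment_mask, kernel):
    """Hypothetical sentiment-conditioned cross-modal attention.

    text: (L, d) word features; nonverbal: (T, d) acoustic or visual features;
    sentiment_mask: (L,) booleans marking assumed sentiment words.
    """
    # Assumed CNN augmentation: ReLU convolution plus a residual connection.
    local = np.maximum(conv1d_same(nonverbal, kernel), 0.0) + nonverbal
    # Only sentiment words act as queries (the "context condition").
    queries = text[sentiment_mask]                      # (S, d)
    scores = queries @ local.T / np.sqrt(text.shape[1])
    weights = softmax(scores, axis=-1)                  # (S, T)
    attended = weights @ local                          # (S, d)
    # Add the attended nonverbal features back to the sentiment words only.
    fused = text.copy()
    fused[sentiment_mask] += attended
    return fused, weights

rng = np.random.default_rng(0)
text = rng.normal(size=(5, 8))
nonverbal = rng.normal(size=(12, 8))
mask = np.array([False, True, False, True, False])
kernel = rng.normal(size=(3, 8, 8)) * 0.1
fused, weights = conditional_cross_attention(text, nonverbal, mask, kernel)
print(fused.shape, weights.shape)
```

In this sketch, words that are not marked as sentiment words pass through unchanged, so the nonverbal evidence is attached only where the language carries sentiment. A sparse variant could, for example, keep only the largest attention weights per query, but that too would be an assumption rather than the published design.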

 
