WMA: A Multi-Scale Self-Attention Feature Extraction Network Based on Weight Sharing for VQA (Cited by: 1)

Authors: Yue Li, Jin Liu, Shengjie Shang

Affiliation: [1] Shanghai Maritime University, Shanghai 201306, China

Source: Journal on Big Data, 2021, No. 3, pp. 111-118 (8 pages)

Funding: This work is supported by the National Natural Science Foundation of China (61872231, 61701297).

Abstract: Visual Question Answering (VQA) has attracted extensive research attention and has recently become a hot topic in deep learning. Advances in computer vision and natural language processing have contributed to progress in this area. The key to improving the performance of a VQA system lies in its feature extraction, multimodal fusion, and answer prediction modules. An unsolved issue in popular VQA image feature extraction modules is the difficulty of extracting fine-grained features from objects of different scales. In this paper, a novel feature extraction network that combines multi-scale convolution and self-attention branches is designed to solve this problem. Our approach achieves state-of-the-art single-model performance on the Pascal VOC 2012, VQA 1.0, and VQA 2.0 datasets.
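The abstract describes the design only at a high level: parallel multi-scale convolution branches combined with a self-attention branch for image feature extraction. Below is a minimal PyTorch sketch of what such a block might look like; the kernel sizes, channel width, head count, concatenation-based fusion, and all module names are illustrative assumptions rather than details taken from the paper, and the weight-sharing mechanism named in the title is not reflected here.

# Hypothetical sketch of a multi-scale convolution + self-attention feature
# extraction block. Layer names and hyperparameters are assumptions, not
# taken from the paper.
import torch
import torch.nn as nn

class MultiScaleSelfAttentionBlock(nn.Module):
    def __init__(self, channels: int = 256, num_heads: int = 8):
        super().__init__()
        # Parallel convolution branches with different receptive fields,
        # intended to capture objects at different scales (kernel sizes assumed).
        self.branch3 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.branch5 = nn.Conv2d(channels, channels, kernel_size=5, padding=2)
        self.branch7 = nn.Conv2d(channels, channels, kernel_size=7, padding=3)
        # Self-attention branch over the flattened spatial positions.
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        # 1x1 convolution fusing the concatenated branch outputs.
        self.fuse = nn.Conv2d(channels * 4, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        conv_feats = [self.branch3(x), self.branch5(x), self.branch7(x)]
        # Self-attention: (B, C, H, W) -> (B, H*W, C) -> attention -> back.
        seq = x.flatten(2).transpose(1, 2)          # (B, H*W, C)
        attn_out, _ = self.attn(seq, seq, seq)      # (B, H*W, C)
        attn_feat = attn_out.transpose(1, 2).reshape(b, c, h, w)
        fused = torch.cat(conv_feats + [attn_feat], dim=1)
        return self.fuse(fused)

if __name__ == "__main__":
    block = MultiScaleSelfAttentionBlock(channels=256)
    feats = torch.randn(2, 256, 14, 14)             # dummy image feature map
    print(block(feats).shape)                       # torch.Size([2, 256, 14, 14])

In this sketch the convolution branches supply scale diversity while the attention branch supplies global spatial context; how the actual paper fuses the two and shares weights across branches is not specified in the abstract.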

Keywords: VQA, feature extraction, self-attention, fine-grained

Classification: TP3 [Automation and Computer Technology - Computer Science and Technology]

 
