Efficient image question answering model based on layered attention mechanism

Cited by: 9

Authors: Lin Boqiang; Tian Wenhong [1]

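Affiliation: [1] School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu 610054, China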

Source: Application Research of Computers (《计算机应用研究》), 2021, No. 2, pp. 636-640 (5 pages)

Funding: National Natural Science Foundation of China (61672136, 61828202).

Abstract: Visual question answering (VQA) is a new challenge in deep learning: a model must reason jointly over the semantics of a question and the content of an image to give the correct answer. To handle the diversity of image inputs in VQA, the authors design a layered attention mechanism built by stacking two attention layers, which helps the model locate the question-relevant information in the image. The first attention layer uses an object detection network to extract features of the objects in the image, and the second attention layer introduces the question features. The paper also improves on existing feature fusion methods, removing their restriction on the size of the input features. Test results on the VQA dataset show that the layered attention mechanism raises answer accuracy on counting questions by 4% to 5%, with small gains on other question types as well.
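The abstract does not say how the second attention layer scores image regions against the question, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that the first layer has already produced one feature vector per detected object, and that relevance is measured by a scaled dot product after projecting both inputs into a shared space. Because of that projection, the object and question features may have different sizes. The names `question_guided_attention`, `project`, `w_obj`, and `w_q` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def project(vec, weights):
    """Linear map; `weights` holds one row per output dimension."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def question_guided_attention(object_feats, question_feat, w_obj, w_q):
    """Weight detected-object features by relevance to the question.

    object_feats:  K vectors of size d_v (assumed output of the first layer)
    question_feat: one vector of size d_q
    w_obj, w_q:    projections into a shared space of size d_h (assumption),
                   so d_v and d_q do not have to match
    """
    q_proj = project(question_feat, w_q)
    scale = math.sqrt(len(q_proj))
    scores = []
    for feat in object_feats:
        v_proj = project(feat, w_obj)
        scores.append(sum(a * b for a, b in zip(v_proj, q_proj)) / scale)
    weights = softmax(scores)
    dim = len(object_feats[0])
    # Attended image feature: weighted sum of the original object features.
    attended = [
        sum(w * feat[i] for w, feat in zip(weights, object_feats))
        for i in range(dim)
    ]
    return weights, attended


# Toy example: two detected objects (d_v = 3) and one question vector (d_q = 2).
object_feats = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
question_feat = [1.0, 0.0]
w_obj = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # maps 3 -> 2
w_q = [[1.0, 0.0], [0.0, 1.0]]              # maps 2 -> 2
weights, attended = question_guided_attention(
    object_feats, question_feat, w_obj, w_q
)
```

In this toy setup the first object aligns with the question vector, so it receives the larger attention weight. A real model would learn `w_obj` and `w_q` and would take the object features from an object detection network.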

Keywords: visual question answering; attention mechanism; feature fusion; object detection

Classification: TP391.41 [Automation and Computer Technology: Computer Application Technology]
