Multi-label image classification algorithm based on spatial attention and graph convolution

Cited by: 1

Authors: KANG Pingping; HOU Jing [2,3]; ZHOU Haoran; CHEN Zirui; LI Chen [1,3]

Affiliations: [1] School of Computer and Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756, Sichuan, China; [2] Laboratory of Intelligent Perception and Smart Operation & Maintenance, School of Information Science and Technology, Southwest Jiaotong University, Chengdu 611756, Sichuan, China; [3] National Engineering Laboratory of Integrated Transportation Big Data Application Technology, Southwest Jiaotong University, Chengdu 611756, Sichuan, China

Source: Microelectronics & Computer (《微电子学与计算机》), 2022, No. 5, pp. 10-19 (10 pages)

Funding: Sichuan Science and Technology Program (2020SYSY0016); National Key Research and Development Program of China (2020YFB1711902).

Abstract: Traditional multi-label image classification models struggle to generate high-level image features that are close to the relevant labels, and they fail to exploit the visual correlation between labels, which limits recognition accuracy. To address these problems, this paper proposes a multi-label image classification algorithm based on spatial attention and graph convolution. First, a graph convolutional network learns features of the label adjacency graph, and the GloVe algorithm is used to obtain label embeddings from the label sequence. Second, an improved spatial attention network is applied to the high-level semantic information to recalibrate category-specific semantic features and suppress background and interfering information. Finally, in a classifier based on co-occurrence feature fusion, the high-level semantic information is integrated with the label co-occurrence features extracted by the graph convolutional network, and the final prediction is made in a channel-wise one-to-one manner. Experiments on two public datasets show that the algorithm achieves an average precision of 81.42% on MS-COCO and 94.3% on VOC-2007, which is 1.13 and 1.3 percentage points higher, respectively, than the baseline MLGCN network. The model has only one-eighth as many parameters as the original model and requires far fewer training iterations, which greatly reduces training cost.
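The label-graph step in the abstract can be illustrated with a minimal sketch. This is not the authors' model: it only shows the general idea of turning label co-occurrence into an adjacency matrix, passing label embeddings through one graph-convolution layer, and scoring an image feature against the resulting per-label vectors. The spatial attention network and the co-occurrence feature fusion classifier are omitted. The function names, the 0.4 threshold, the toy embeddings (standing in for GloVe vectors), and the toy image feature are all assumptions made for illustration.

```python
import math

def cooccurrence_adjacency(label_sets, num_labels, threshold=0.4):
    """Binary label adjacency: edge i->j if P(j present | i present) >= threshold.
    The threshold value is an assumption, not taken from the paper."""
    count = [0] * num_labels
    pair = [[0] * num_labels for _ in range(num_labels)]
    for labels in label_sets:
        for i in labels:
            count[i] += 1
            for j in labels:
                if i != j:
                    pair[i][j] += 1
    adj = [[0.0] * num_labels for _ in range(num_labels)]
    for i in range(num_labels):
        for j in range(num_labels):
            if i == j:
                adj[i][j] = 1.0  # self-loop keeps each label's own embedding
            elif count[i] > 0 and pair[i][j] / count[i] >= threshold:
                adj[i][j] = 1.0
    return adj

def normalize(adj):
    """Scale entry (i, j) by 1 / sqrt(deg_i * deg_j), using row sums as degrees."""
    deg = [sum(row) for row in adj]
    n = len(adj)
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(a_hat, h, w):
    """One linear graph-convolution step: A_hat @ H @ W (no activation)."""
    return matmul(matmul(a_hat, h), w)

def predict(image_feature, label_vectors):
    """Sigmoid of the dot product between the image feature and each label vector."""
    return [1.0 / (1.0 + math.exp(-sum(f * c for f, c in zip(image_feature, vec))))
            for vec in label_vectors]

# Toy data: 4 images, 3 labels; labels 0 and 1 often appear together.
label_sets = [{0, 1}, {0, 1}, {2}, {0}]
adj = cooccurrence_adjacency(label_sets, num_labels=3)
a_hat = normalize(adj)

embeddings = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]  # hypothetical label embeddings
weights = [[1.0, 0.0], [0.0, 1.0]]                    # identity weights for clarity
label_vectors = gcn_layer(a_hat, embeddings, weights)

scores = predict([2.0, 2.0], label_vectors)           # hypothetical image feature
print(scores)
```

In this toy setup the two co-occurring labels share information through the graph, so they end up with the same score for the example image, while the unrelated third label scores low.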

Keywords: graph convolutional network; multi-label image classification; spatial attention; feature fusion

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
