A Multi-Label Image Classification Method Based on a Label Correlation Learning Network


Authors: WANG Lufang [1]; ZHANG Haiyun

Affiliations: [1] Experimental Training Center, Shanxi University of Finance and Economics, Taiyuan 030031, China; [2] Institute of Big Data Science and Industry, Shanxi University, Taiyuan 030006, China; [3] School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China

Source: Journal of Taiyuan University of Technology, 2024, No. 6, pp. 1097-1106 (10 pages)

Funding: National Natural Science Foundation of China (62072291).

Abstract: 【Purposes】To address label feature confusion and the limitations of label relationships in multi-label image classification, a multi-label image classification method based on a label correlation learning network (MLLCLN) is proposed. 【Methods】MLLCLN combines a masked attention approach with a multi-head self-attention mechanism. In the masked attention approach, the label features produced by the attention mechanism are masked with the state word vectors corresponding to the ground-truth labels of the image. This lets the model obtain more contextual information and, to some extent, avoids overlapping attention regions, which alleviates label feature confusion. In addition, a label correlation learning network is designed, consisting of multiple layers of multi-head attention and a graph neural network. The multi-head self-attention mechanism learns local label relationships from the label features, while the graph neural network uses the existing ML-GCN method as guidance so that the model also considers global label relationships, mitigating the false predictions caused by limited label relationships. 【Findings】Experiments on the public datasets MSCOCO2014 and VOC2007 show that MLLCLN performs well, achieving classification accuracies of 84.4% and 96.0%, respectively, and offers a new approach to multi-label image classification.
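The two kinds of label relationship named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, tensor shapes, random weights, and the threshold `tau` are all assumptions made for illustration, and the co-occurrence adjacency is only a simplified stand-in for the label graph that ML-GCN-style methods build from training labels. The first function applies multi-head self-attention across per-label feature vectors (local, image-specific relationships); the second derives a dataset-level adjacency from label co-occurrence (global relationships). How MLLCLN combines the two is not stated in the abstract and is not shown here.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(label_feats, w_q, w_k, w_v, num_heads):
    """Self-attention across labels (hypothetical helper).

    label_feats: (C, d) array, one feature vector per label.
    w_q, w_k, w_v: (d, d) projection matrices (random here, learned in practice).
    Returns the attended features (C, d) and attention maps (num_heads, C, C).
    """
    c, d = label_feats.shape
    assert d % num_heads == 0, "feature size must be divisible by num_heads"
    d_h = d // num_heads

    def split(x):
        # (C, d) -> (num_heads, C, d_h)
        return x.reshape(c, num_heads, d_h).transpose(1, 0, 2)

    q = split(label_feats @ w_q)
    k = split(label_feats @ w_k)
    v = split(label_feats @ w_v)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_h)  # (num_heads, C, C)
    attn = softmax(scores, axis=-1)                   # each row sums to 1
    out = (attn @ v).transpose(1, 0, 2).reshape(c, d) # merge heads back
    return out, attn

def cooccurrence_adjacency(labels, tau=0.4):
    """Binary label graph from co-occurrence statistics (simplified assumption).

    labels: (N, C) binary matrix of training annotations.
    An edge i -> j is kept when the estimated P(label j | label i) >= tau.
    """
    labels = labels.astype(np.float64)
    co = labels.T @ labels                 # co[i, j] = images containing both i and j
    counts = np.diag(co).copy()            # counts[i] = images containing i
    cond = co / np.maximum(counts[:, None], 1.0)
    adj = (cond >= tau).astype(np.float64)
    np.fill_diagonal(adj, 1.0)             # keep self-loops
    return adj

# Toy data: 5 labels, 8-dimensional label features, 2 heads, 20 annotated images.
rng = np.random.default_rng(0)
C, d, H = 5, 8, 2
feats = rng.normal(size=(C, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))
out, attn = multi_head_self_attention(feats, w_q, w_k, w_v, H)
labels = rng.integers(0, 2, size=(20, C))
adj = cooccurrence_adjacency(labels)
```

In this sketch the attention maps depend on the features of a single image, whereas the adjacency is fixed once from the whole label set, which mirrors the local versus global distinction drawn in the abstract.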

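Keywords: multi-head self-attention mechanism; multi-label image classification; attention mechanism; adaptive weights; convolutional neural network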

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
