A teacher-student based attention network for fine-grainedimage recognition  

作  者:Ang Li Xueyi Zhang Peilin Li Bin Kang 

机构地区:[1]Key Laboratory of Broadband Wireless Communication and Sensor Network Technology(Ministry of Education),Nanjing University of Posts and Telecommunications,Nanjing,210003,China [2]School of Communications and Information Engineering,Nanjing University of Posts and Telecommunications,Nanjing,210003,China [3]School of Internet of Things,Nanjing University of Posts and Telecommunications,Nanjing,210003,China

出  处:《Digital Communications and Networks》2025年第1期52-59,共8页数字通信与网络(英文版)

基  金:supported by the National Natural Science Foundation of China,China (Grants No.62171232);the Priority Academic Program Development of Jiangsu Higher Education Institutions,China。

摘  要:Fine-grained Image Recognition(FGIR)task is dedicated to distinguishing similar sub-categories that belong to the same super-category,such as bird species and car types.In order to highlight visual differences,existing FGIR works often follow two steps:discriminative sub-region localization and local feature representation.However,these works pay less attention on global context information.They neglect a fact that the subtle visual difference in challenging scenarios can be highlighted through exploiting the spatial relationship among different subregions from a global view point.Therefore,in this paper,we consider both global and local information for FGIR,and propose a collaborative teacher-student strategy to reinforce and unity the two types of information.Our framework is implemented mainly by convolutional neural network,referred to Teacher-Student Based Attention Convolutional Neural Network(T-S-ACNN).For fine-grained local information,we choose the classic Multi-Attention Network(MA-Net)as our baseline,and propose a type of boundary constraint to further reduce background noises in the local attention maps.In this way,the discriminative sub-regions tend to appear in the area occupied by fine-grained objects,leading to more accurate sub-region localization.For fine-grained global information,we design a graph convolution based Global Attention Network(GA-Net),which can combine extracted local attention maps from MA-Net with non-local techniques to explore spatial relationship among subregions.At last,we develop a collaborative teacher-student strategy to adaptively determine the attended roles and optimization modes,so as to enhance the cooperative reinforcement of MA-Net and GA-Net.Extensive experiments on CUB-200-2011,Stanford Cars and FGVC Aircraft datasets illustrate the promising performance of our framework.

关 键 词:Fine-grained image recognition Collaborative teacher-student strategy Multi-attention Global attention 

分 类 号:TP3[自动化与计算机技术—计算机科学与技术]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象