Siamese transformer with hierarchical concept embedding for fine-grained image recognition  Cited by: 1


Authors: Yilin LYU, Liping JING, Jiaqi WANG, Mingzhe GUO, Xinyue WANG, Jian YU

Affiliations: [1] School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China [2] Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing 100044, China [3] Alibaba Group, Beijing 100102, China

Source: Science China (Information Sciences), 2023, Issue 3, pp. 184-199 (16 pages); 中国科学(信息科学)(英文版) [Science China (Information Sciences), English edition]

Funding: Supported by the National Key Research and Development Program of China (Grant No. 2020AAA0106800); Beijing Natural Science Foundation (Grant Nos. Z180006, L211016); National Natural Science Foundation of China (Grant No. 62176020); CAAI-Huawei MindSpore Open Fund; Chinese Academy of Sciences (Grant No. OEIP-O-202004)

Abstract: Distinguishing the subtle differences among fine-grained images from subordinate concepts of a concept hierarchy is a challenging task. In this paper, we propose a Siamese transformer with hierarchical concept embedding (STrHCE), which contains two transformer subnetworks sharing all configurations, and each subnetwork is equipped with the hierarchical semantic information at different concept levels for fine-grained image embeddings. In particular, one subnetwork processes coarse-scale patches to learn the discriminative regions with the aid of the innate multi-head self-attention mechanism of the transformer. The other subnetwork processes finer-scale patches, which are adaptively sampled from the discriminative regions, to capture subtle yet discriminative visual cues and eliminate redundant information. STrHCE connects the two subnetworks through a score margin adjustor that forces the most discriminative regions to generate more confident predictions. Extensive experiments conducted on four commonly used benchmark datasets, namely CUB-200-2011, FGVC-Aircraft, Stanford Dogs, and NABirds, empirically demonstrate the superiority of the proposed STrHCE over state-of-the-art baselines.
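
The abstract names two mechanisms without giving formulas: picking discriminative regions from self-attention, and a score margin adjustor linking the coarse and fine subnetworks. The sketch below is one plausible reading, not the authors' implementation. It assumes that regions are ranked by head-averaged attention from a class token to each patch, and that the adjustor behaves like a hinge penalty applied when the fine branch is not more confident than the coarse branch on the true class. The function names, the `margin` value, and the hinge form are all hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_patches(cls_attention, k):
    """Return indices of the k patches with the highest mean attention.

    cls_attention: one list per head, each holding the class-token-to-patch
    attention weights (assumed input layout, not taken from the paper).
    """
    num_heads = len(cls_attention)
    num_patches = len(cls_attention[0])
    # Average the attention each patch receives across heads.
    mean_attn = [
        sum(head[i] for head in cls_attention) / num_heads
        for i in range(num_patches)
    ]
    ranked = sorted(range(num_patches), key=lambda i: mean_attn[i], reverse=True)
    # Sort the selected indices so they keep their spatial order.
    return sorted(ranked[:k])


def score_margin_loss(coarse_logits, fine_logits, label, margin=0.05):
    """Hypothetical hinge penalty: zero once the fine branch's true-class
    probability exceeds the coarse branch's by at least `margin`."""
    p_coarse = softmax(coarse_logits)[label]
    p_fine = softmax(fine_logits)[label]
    return max(0.0, p_coarse - p_fine + margin)
```

In this reading, the patches returned by `top_k_patches` would mark the regions from which finer-scale patches are resampled for the second subnetwork, and `score_margin_loss` would be added to the classification losses of both branches.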

Keywords: fine-grained image recognition; transformer; hierarchical concept embedding; adaptive sampling; Siamese network

Classification: TN957.52 [Electronics and Telecommunications: Signal and Information Processing]

 
