Hash Image Retrieval Based on Category Similarity Feature Expansion and Center Triplet Loss

Cited by: 3


Authors: PAN Lili [1]; MA Junyong; XIONG Siyu; DENG Zhimao; HU Qinghua [2]

Affiliations: [1] College of Computer and Information Engineering, Central South University of Forestry and Technology, Changsha 410004; [2] College of Intelligence and Computing, Tianjin University, Tianjin 300350

Source: Pattern Recognition and Artificial Intelligence, 2023, No. 8, pp. 685-700 (16 pages)

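Funding: Supported by the General Program of the Natural Science Foundation of Hunan Province (No. 2021JJ31164) and the Key Scientific Research Project of the Hunan Provincial Department of Education (No. 22A0195).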

Abstract: Existing deep hashing methods for image retrieval mainly rely on convolutional neural networks, and the deep features they extract have limited ability to represent similarity. In addition, triplet deep hashing mainly builds local triplets from mini-batch data, so the number of triplets is small and the data distribution lacks a global view, which leaves the network insufficiently trained and makes convergence difficult. To address these problems, a hash image retrieval model based on category similarity feature expansion and center triplet loss (HRFT-Net) is proposed. A hash feature extraction module based on Vision Transformer (HViT) is designed to extract global features with stronger representation ability. To enlarge the mini-batch training data, a similar feature expansion module based on category constraint (SFEC) is put forward; it generates new features from the similarity among samples of the same category, enriching the triplet training samples. To strengthen the global nature of the triplet loss, a center triplet loss function based on Hadamard (CTLH) is constructed, in which Hadamard is used to establish a global hash center constraint for each class. By adding center triplets that combine the local constraint with the global center constraint, CTLH accelerates the learning and convergence of the network and improves retrieval accuracy. Experiments on the CIFAR10 and NUS-WIDE datasets show that HRFT-Net achieves better mean average precision for hash codes of different bit lengths, which verifies the effectiveness of HRFT-Net.
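The abstract does not give the CTLH formulation, so the following is only a minimal NumPy sketch of the general idea of Hadamard-based class hash centers combined with a triplet-style hinge toward them. It assumes that "Hadamard" refers to a Hadamard matrix whose rows serve as class centers. The function names (`hadamard_matrix`, `hash_centers`, `center_triplet_loss`), the Sylvester construction, the squared Euclidean distance, and the `margin` parameter are all illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def hadamard_matrix(n):
    """Build an n x n Hadamard matrix by the Sylvester construction.

    n must be a power of two. Rows are mutually orthogonal +/-1 vectors.
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = np.array([[1]])
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])
    return h


def hash_centers(num_classes, bits):
    """Pick one +/-1 center per class from the rows of H and -H.

    Assumed scheme for illustration: the paper's own assignment of
    centers to classes is not described in the abstract.
    """
    h = hadamard_matrix(bits)
    pool = np.vstack([h, -h])
    if num_classes > pool.shape[0]:
        raise ValueError("not enough Hadamard rows for this many classes")
    return pool[:num_classes]


def center_triplet_loss(codes, labels, centers, margin):
    """Hinge loss pulling each code toward its own class center and
    pushing it away from the nearest center of any other class.

    Hypothetical stand-in for the global part of a center triplet loss.
    """
    # Squared Euclidean distance from every code to every center.
    dist = ((codes[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    rows = np.arange(len(labels))
    pos = dist[rows, labels]
    # Mask out the own-class center before taking the nearest other one.
    other = dist.astype(float).copy()
    other[rows, labels] = np.inf
    neg = other.min(axis=1)
    return np.maximum(0.0, pos - neg + margin).mean()


centers = hash_centers(num_classes=10, bits=16)
labels = np.array([0, 3, 7])
perfect_codes = centers[labels].astype(float)
loss_at_centers = center_triplet_loss(perfect_codes, labels, centers, margin=8.0)
```

In this sketch, any two distinct centers drawn from the rows of the same Hadamard matrix differ in exactly half of their bits, which is the property that makes such rows attractive as well-separated global anchors. A code that already sits on its own class center contributes no loss as long as the margin is smaller than the distance between centers.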

Keywords: image retrieval; deep hashing; Vision Transformer (ViT); feature expansion; triplet loss

CLC number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
