Local Feature Fusion based on Vision Transformer for Unsupervised Person Re-identification


Authors: YANG Wenqin [1]; DING Zhaofeng; SONG Zhigang [1]

Affiliations: [1] The Academy of Digital China (Fujian), Fuzhou University, Fuzhou 350108, China; [2] School of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, China

Source: Journal of Fujian Computer, 2023, No. 12, pp. 8-14 (7 pages)

Funding: Supported by the National Natural Science Foundation of China (No. 62271149); the Central Government Guided Local Science and Technology Development Project (No. 2022L3003); and the Provincial Department of Education Young and Middle-aged Researchers Project (No. JAT200039).

Abstract: Existing unsupervised person re-identification methods compute sample similarity directly from global features, which degrades the quality of the generated pseudo labels, and approaches based on convolutional neural networks are prone to losing fine-grained details. To address these problems, this paper designs an unsupervised person re-identification network based on local feature fusion, which alleviates information loss through a random-sliding-window image encoding scheme. To produce reliable pseudo labels, the "discarded features" unique to the Vision Transformer are used as local variables to compute local similarity, and camera similarity is fused in to improve the accuracy of the overall sample similarity and thereby the quality of pseudo-label generation. Experimental results show that the proposed method substantially improves model performance on the public Market-1501 and DukeMTMC-reID datasets.
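
The abstract outlines the core pipeline: compute local similarity from the ViT's discarded tokens, fuse it with global and camera-aware similarity, and cluster the result to obtain pseudo labels. The following is a minimal, illustrative sketch of that kind of similarity fusion and clustering step, not the paper's implementation: feature extraction is mocked with random vectors, and the fusion weights, camera penalty, and DBSCAN parameters are placeholder assumptions.

```python
# Hypothetical sketch of similarity fusion + pseudo-label clustering.
# Real ViT features are replaced by random vectors; all weights and
# hyperparameters below are illustrative, not the paper's settings.
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.metrics.pairwise import cosine_similarity

rng = np.random.default_rng(0)
num_samples, dim = 200, 768
global_feats = rng.normal(size=(num_samples, dim))   # e.g. [CLS]-token features
local_feats = rng.normal(size=(num_samples, dim))    # e.g. pooled "discarded" patch tokens
cam_ids = rng.integers(0, 6, size=num_samples)       # camera label per image

# Pairwise cosine similarities for the global and local branches.
sim_global = cosine_similarity(global_feats)
sim_local = cosine_similarity(local_feats)

# Camera-aware term: down-weight same-camera pairs so cross-camera
# matches are not drowned out (illustrative choice of penalty).
same_cam = (cam_ids[:, None] == cam_ids[None, :]).astype(float)
sim_fused = 0.5 * sim_global + 0.5 * sim_local - 0.1 * same_cam

# Convert similarity to a distance matrix and cluster to get pseudo labels.
dist = np.clip(1.0 - sim_fused, 0.0, None)
np.fill_diagonal(dist, 0.0)
pseudo_labels = DBSCAN(eps=0.6, min_samples=4, metric="precomputed").fit_predict(dist)
print("pseudo-label ids:", np.unique(pseudo_labels))
```

In this kind of scheme the fused distance matrix replaces the purely global one, so samples that look alike only in global appearance but differ in local detail or come from the same camera are less likely to be merged into the same pseudo-identity cluster.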

Keywords: person re-identification; Vision Transformer; unsupervised learning; clustering algorithm

Classification Code: TP39 [Automation and Computer Technology - Computer Application Technology]
