VPM-Net:Person Re-ID Network Based on Visual Prompt Technology and Multi-Instance Negative Pooling  

在线阅读下载全文

作  者:Haitao Xie Yuliang Chen Yunjie Zeng Lingyu Yan Zhizhi Wang Zhiwei Ye 

机构地区:[1]School of Computer Science,Hubei University of Technology,Wuhan,430068,China

出  处:《Computers, Materials & Continua》2025年第5期3389-3410,共22页计算机、材料和连续体(英文)

基  金:funded by the Key Research and Development Program of Hubei Province,China(Grant No.2023BEB024);the Young and Middle-aged Scientific and Technological Innova-tion Team Plan in Higher Education Institutions inHubei Province,China(GrantNo.T2023007);the key projects ofHubei Provincial Department of Education(No.D20161403).

摘  要:With the rapid development of intelligent video surveillance technology,pedestrian re-identification has become increasingly important inmulti-camera surveillance systems.This technology plays a critical role in enhancing public safety.However,traditional methods typically process images and text separately,applying upstream models directly to downstream tasks.This approach significantly increases the complexity ofmodel training and computational costs.Furthermore,the common class imbalance in existing training datasets limitsmodel performance improvement.To address these challenges,we propose an innovative framework named Person Re-ID Network Based on Visual Prompt Technology andMulti-Instance Negative Pooling(VPM-Net).First,we incorporate the Contrastive Language-Image Pre-training(CLIP)pre-trained model to accurately map visual and textual features into a unified embedding space,effectively mitigating inconsistencies in data distribution and the training process.To enhancemodel adaptability and generalization,we introduce an efficient and task-specific Visual Prompt Tuning(VPT)technique,which improves the model’s relevance to specific tasks.Additionally,we design two key modules:the Knowledge-Aware Network(KAN)and theMulti-Instance Negative Pooling(MINP)module.The KAN module significantly enhances the model’s understanding of complex scenarios through deep contextual semantic modeling.MINP module handles samples,effectively improving the model’s ability to distinguish fine-grained features.The experimental outcomes across diverse datasets underscore the remarkable performance of VPM-Net.These results vividly demonstrate the unique advantages and robust reliability of VPM-Net in fine-grained retrieval tasks.

关 键 词:Person re-identification multi-instance negative pooling visual prompt tuning 

分 类 号:TP391.41[自动化与计算机技术—计算机应用技术]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象