Authors: XU Ling (徐领); MIAO Yi (缪翌); ZHANG Weifeng (张卫锋) (School of Computer Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China; School of Information Science and Engineering, Jiaxing University, Jiaxing 314001, China)
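Affiliations: [1] School of Computer Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, Zhejiang, China; [2] School of Information Science and Engineering, Jiaxing University, Jiaxing 314001, Zhejiang, China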
Source: Modern Electronics Technique (《现代电子技术》), 2024, No. 22, pp. 44-50 (7 pages)
Abstract: Cross-modal pedestrian retrieval faces two problems: extracting effective fine-grained features from images and texts, and aligning images with natural language descriptions across modalities. To address them, a cross-modal pedestrian retrieval model based on multi-scale feature enhancement and alignment is proposed. The model introduces a multimodal pre-trained model and constructs a text-guided masked image modeling auxiliary task that fully realizes cross-modal interaction, which strengthens the model's ability to learn local image detail features without explicit annotation. In addition, because the identities in pedestrian images are easily confused, a global image feature matching auxiliary task is designed to guide the model toward identity-relevant visual features. Experiments on several public datasets, including CUHK-PEDES, ICFG-PEDES, and RSTPReid, show that the proposed model surpasses existing mainstream models, with Rank-1 accuracy (first hit rate) of 72.47%, 62.71%, and 59.25%, respectively, achieving high-accuracy cross-modal pedestrian retrieval.
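The first hit rate (Rank-1 accuracy) reported in the abstract can be illustrated with a minimal sketch. It assumes text queries and gallery images have already been encoded into feature vectors and are ranked by cosine similarity. The function name `rank1_accuracy`, the argument names, and the toy data are hypothetical and are not taken from the paper, whose exact evaluation protocol is not given in this record.

```python
import numpy as np

def rank1_accuracy(text_feats, image_feats, text_ids, image_ids):
    """Fraction of text queries whose top-ranked gallery image has the same identity.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    # L2-normalise rows so that the dot product equals cosine similarity.
    t = text_feats / np.linalg.norm(text_feats, axis=1, keepdims=True)
    g = image_feats / np.linalg.norm(image_feats, axis=1, keepdims=True)
    sim = t @ g.T                      # shape: (num_queries, num_gallery)
    top1 = np.argmax(sim, axis=1)      # index of the best-matching image per query
    return float(np.mean(image_ids[top1] == text_ids))

# Toy data (assumed): three text queries, three gallery images, 2-D features.
text_feats = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
image_feats = np.array([[0.9, 0.1], [0.1, 0.9], [-1.0, 0.0]])
text_ids = np.array([0, 1, 2])
image_ids = np.array([0, 1, 2])

score = rank1_accuracy(text_feats, image_feats, text_ids, image_ids)
print(f"Rank-1 accuracy: {score:.4f}")
```

In this toy setup the first two queries retrieve an image of the correct identity and the third does not, so two of three queries count as first hits.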
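Keywords: cross-modal pedestrian retrieval; multi-scale feature enhancement; multimodal alignment; CLIP; image masking; cross-modal interaction; cross-attention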
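Classification numbers: TN911-34 [Electronics and Telecommunications: Communication and Information Systems]; TP391.41 [Electronics and Telecommunications: Information and Communication Engineering]; TP183 [Automation and Computer Technology: Computer Application Technology]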