Authors: ZHAO Wei (赵伟); LIU Lei (刘磊) [1,2]; WANG Kunpeng (王鲲鹏); TU Zhengzheng (涂铮铮); LUO Bin (罗斌)
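Affiliations: [1] Anhui Provincial Key Laboratory of Multimodal Cognitive Computation, Hefei 230601, China; [2] School of Computer Science and Technology, Anhui University, Hefei 230601, China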
Source: Journal of Beijing University of Aeronautics and Astronautics (《北京航空航天大学学报》), 2024, No. 2, pp. 596-605 (10 pages)
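Funding: National Natural Science Foundation of China (62376005); Anhui Provincial Key Research and Development Program (202104d07020008, KJ2020A0033); Natural Science Foundation of Anhui Province (2108085MF211); Anhui Provincial University Collaborative Innovation Project (GXXT-2022-014).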
Abstract: RGB-thermal infrared (RGBT) object tracking, which has drawn increasing interest in recent years, aims to exploit the complementary strengths of visible-light (RGB) and thermal infrared data to achieve robust tracking. To obtain a robust appearance representation of the target, mainstream methods typically introduce modality weights to fuse the information of the two modalities. However, simply assigning a weight to each modality cannot fully exploit the complementary advantages of the RGB and thermal infrared modalities. To address this problem, a multimodal bidirectional information enhancement network for RGBT tracking (MBIENet) is proposed. First, a feature aggregation module is designed to aggregate modality-shared and modality-specific features for modeling the appearance of the target. Second, a novel multimodal bidirectional modulation fusion module is proposed that effectively fuses the complementary information of the two modalities and reduces the impact of redundant and useless features on the tracker. Third, a lightweight channel-spatial attention module is proposed that adaptively adjusts the contribution of each modality under different conditions. Experimental results on the GTOT, RGBT234, and LasHeR datasets show that the proposed tracker achieves a higher accuracy rate and success rate than current mainstream trackers.
Keywords: RGB-thermal infrared (RGBT); object tracking; deep learning; multimodal information fusion; multimodal information interaction
Classification number: TP183 [Automation and Computer Technology: Control Theory and Control Engineering]