FSSiamNet: Feature Fusion Shift Siamese Network for RGB-T Target Tracking


Authors: LI Haiyan; CAO Yonghui; LANG Xun; LI Haijiang (School of Information Science and Engineering, Yunnan University, Kunming 650000, China; Yunnan Communications Investment and Construction Group Co., Ltd., Kunming 650000, China)

Affiliations: [1] School of Information Science and Engineering, Yunnan University, Kunming 650000, China; [2] Yunnan Communications Investment and Construction Group Co., Ltd., Kunming 650000, China

Source: Journal of Hunan University (Natural Sciences), 2025, No. 4, pp. 68-78 (11 pages)

Funding: National Natural Science Foundation of China (62166048); Yunnan Province "Ten-Thousand Talents Program" Yunling Teaching Master; Yunnan Provincial University Key Laboratory Construction Plan (202101AS070031); 14th Yunnan University Graduate Research Innovation Project (KC-22221737).

Abstract: To address the problems of existing target tracking algorithms, namely difficulty in extracting deep-level features, failure to fully exploit cross-modal information, and weak target feature representation, a feature fusion shift Siamese network (FSSiamNet) for RGB-T target tracking is proposed. First, starting from the visible-modality SiameseRPN++ tracking framework, an infrared-modality branch is added to obtain a multimodal tracking framework, and a ResNet50 with adjusted stride is designed as the feature extraction network to effectively mine deep-level target features. Next, a feature interactive learning module (FIM) is designed that uses the discriminative information of one modality to guide the learning of target appearance features in the other modality; by mining cross-modal information in the feature space and channels, the module strengthens the network's attention to foreground information. Then, a multimodal feature fusion module (FAM) is designed that computes the degree of feature fusion between the input visible and infrared images, spatially fuses the important features of the different modalities to remove redundant information, and reconstructs the multimodal image with a cascade fusion strategy to enhance target feature representation. Finally, a feature space shift module (FSM) is designed that splits the feature maps of the infrared branch and shifts them in four different directions to enhance the edge representation of heat-source targets. Experiments on two RGB-T datasets validate the effectiveness of the proposed algorithm, and ablation experiments demonstrate the contribution of each designed module.
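The abstract describes the FSM only at a high level: infrared feature maps are split and shifted in four directions to sharpen heat-source edges. A minimal NumPy sketch of one plausible reading, assuming the channels are split into four groups and each group is shifted one pixel up, down, left, or right with zero padding (the exact grouping, shift size, and padding in the paper may differ):

```python
import numpy as np

def feature_space_shift(x, shift=1):
    """Sketch of a four-direction feature space shift (assumed behaviour).

    x: feature map of shape (C, H, W), with C divisible by 4.
    Channels are split into 4 groups; each group is shifted spatially in
    one direction (up, down, left, right), and the vacated border is
    zero-filled so the shift is padding-based rather than circular.
    """
    groups = np.split(x, 4, axis=0)
    out = []
    # (axis, step) per group: up/down along H (axis 1), left/right along W (axis 2)
    directions = [(1, -shift), (1, shift), (2, -shift), (2, shift)]
    for g, (axis, step) in zip(groups, directions):
        s = np.roll(g, step, axis=axis)
        # zero out the wrapped-around border
        idx = [slice(None)] * 3
        idx[axis] = slice(0, shift) if step > 0 else slice(-shift, None)
        s[tuple(idx)] = 0
        out.append(s)
    return np.concatenate(out, axis=0)
```

Shifting each channel group against the others misaligns a feature map with its neighbours by one pixel, so subsequent convolutions see intensity differences at object boundaries, which is one way such a shift can emphasise target edges.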

Keywords: RGB-T tracking; multimodal feature fusion module; feature space shift module; feature interactive learning module

Classification: TP391.4 [Automation and Computer Technology / Computer Application Technology]

 
