Long-term UAV Vision Tracking with Time Domain Siamese Network Fusion Transformer  Cited by: 1

Authors: SHEN Haiyun[1]; YU Peng; WANG Haichuan

Affiliation: [1] School of Electrical Information, Southwest Petroleum University, Chengdu 610500, Sichuan, China

Source: Computer Engineering (《计算机工程》), 2024, No. 11, pp. 107-118 (12 pages)

Funding: Platform Construction (Phase II) Project of the Nanchong Key Laboratory of Smart Grid and Intelligent Control (SXHZ053).

Abstract: When an Unmanned Aerial Vehicle (UAV) performs tracking tasks, scenes involving scale change, low resolution, and target occlusion often cause the tracked bounding box to drift. To address this problem, this study proposes TTTrack, a long-term UAV visual tracking algorithm that fuses a time-domain Siamese network with a Transformer. First, the Siamese-network-based SiamFC++ (AlexNet) algorithm is used as the baseline. Second, a Transformer adaptively extracts the spatio-temporal information of historical frames and updates the template online, so that the spatio-temporal context is stored as a dynamic template. Third, the base template and the dynamic template are each cross-correlated with the search feature map to obtain two response maps. Finally, a Transformer fuses the two response maps to establish a spatio-temporal context mapping between consecutive frames. Experimental results show that on the LaSOT long-sequence tracking benchmark, TTTrack achieves a success rate of 63.9% and a precision of 66.6%; on the UAV123 tracking benchmark, it achieves a success rate of 61.4% and a precision of 80.2%. Compared with the baseline algorithm, its success rate and precision in full-occlusion scenes increase by 7.4 and 8.0 percentage points, respectively. On the DTB70 tracking benchmark, TTTrack reaches a precision of 82.1% at a tracking speed of 118 frames/s, which satisfies real-time requirements. The test results verify that TTTrack has good robustness, real-time performance, and anti-interference ability, and that it can effectively handle long-term UAV tracking tasks.
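
The dual-template step described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the feature maps are toy single-channel grids, and the two Transformer modules are replaced by deliberately simple stand-ins, namely a moving-average template update and a fixed weighted sum of the two response maps. All function names and parameters (`update_dynamic_template`, `fuse_responses`, `rate`, `weight`) are hypothetical and chosen only for illustration.

```python
def cross_correlate(search, template):
    """Slide `template` over `search` (no padding) and return the response map."""
    sh, sw = len(search), len(search[0])
    th, tw = len(template), len(template[0])
    response = []
    for i in range(sh - th + 1):
        row = []
        for j in range(sw - tw + 1):
            # Inner product between the template and the patch at (i, j).
            score = sum(
                search[i + u][j + v] * template[u][v]
                for u in range(th)
                for v in range(tw)
            )
            row.append(score)
        response.append(row)
    return response


def update_dynamic_template(dynamic, new_patch, rate=0.1):
    """Assumed stand-in for the Transformer-based online update: a moving average."""
    return [
        [(1 - rate) * d + rate * n for d, n in zip(d_row, n_row)]
        for d_row, n_row in zip(dynamic, new_patch)
    ]


def fuse_responses(resp_a, resp_b, weight=0.5):
    """Assumed stand-in for the Transformer fusion: a fixed weighted sum."""
    return [
        [weight * a + (1 - weight) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(resp_a, resp_b)
    ]


def argmax_2d(response):
    """Return the (row, col) index of the largest response value."""
    positions = (
        (i, j) for i in range(len(response)) for j in range(len(response[0]))
    )
    return max(positions, key=lambda ij: response[ij[0]][ij[1]])


# Toy single-channel "search feature map" with the target at rows 2-3, cols 2-3.
search = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 2, 0],
    [0, 0, 3, 4, 0],
    [0, 0, 0, 0, 0],
]
base_template = [[1, 2], [3, 4]]
# The dynamic template drifts slightly toward a more recent target appearance.
dynamic_template = update_dynamic_template(base_template, [[1, 2], [3, 5]], rate=0.5)

base_response = cross_correlate(search, base_template)
dynamic_response = cross_correlate(search, dynamic_template)
fused = fuse_responses(base_response, dynamic_response)
peak = argmax_2d(fused)
```

In this toy setting both response maps peak at the same location, so the fused map does as well. The point of keeping two templates is that the base template anchors the original appearance while the dynamic template follows gradual appearance change; a learned fusion, as in the paper, would weight the two maps adaptively rather than with a fixed `weight`.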

Keywords: time-domain Siamese network; Transformer model; Unmanned Aerial Vehicle (UAV); visual tracking; spatio-temporal information

CLC number: TP391.4 [Automation and Computer Technology: Computer Application Technology]

 
