Visual Object Tracking Algorithm Based on Spatio-Temporal Transformer

Authors: WU Xiaojun (武晓军); CHEN Yidan (陈怡丹); FENG Liping (冯丽萍) [1]; SONG Changwei (宋长伟); HE Deqing (何德清) (Xinzhou Normal University, Xinzhou 034000, China; Henan Open University, Zhengzhou 450046, China; College of Information and Management Science, Henan Agricultural University, Zhengzhou 450003, China; School of Computer Science & Technology, Huazhong University of Science and Technology, Wuhan 430074, China)

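Affiliations: [1] Xinzhou Normal University, Xinzhou 034000, Shanxi, China; [2] Henan Open University, Zhengzhou 450046, Henan, China; [3] College of Information and Management Science, Henan Agricultural University, Zhengzhou 450003, Henan, China; [4] School of Computer Science & Technology, Huazhong University of Science and Technology, Wuhan 430074, Hubei, China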

Source: Transducer and Microsystem Technologies (《传感器与微系统》), 2025, No. 3, pp. 152-155 (4 pages)

Funding: National Natural Science Foundation of China (61902139); Shanxi Province Basic Research Program (202203021211116); Xinzhou Normal University project (2021KY16).

Abstract: In visual object tracking, targets move at different speeds, so consecutive frames contribute to the spatio-temporal neighborhood to different degrees. To learn each video frame's contribution to the neighborhood information, a visual object tracking method based on a spatio-temporal Transformer is proposed, which uses the self-attention mechanism to learn the weights of different frames. The algorithm works mainly by associating multi-frame features and aggregating information in the time domain. First, the image is passed through a spatial Transformer encoder (STE), which encodes its spatial features. Then, a spatio-temporal Transformer decoder (STD) module aggregates inter-frame information along the temporal dimension to capture global context in both time and space. Finally, the method is evaluated on mainstream datasets such as LaSOT and GOT-10k. The experimental results show that the algorithm achieves some improvement in precision, success rate, and other evaluation metrics.
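The idea of weighting frames by attention and aggregating them over time can be illustrated with a minimal sketch. This is not the authors' STE/STD implementation, which the abstract does not specify. It assumes each frame has already been reduced to a single feature vector, and it uses plain scaled dot-product attention with the current frame as the query. The names `aggregate_frames`, `history`, and `current`, and all feature values, are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_frames(query, frame_features):
    """Weight each frame by its scaled dot-product similarity to the query.

    Returns the per-frame attention weights and the weighted sum of the
    frame features (the temporally aggregated feature).
    """
    dim = len(query)
    scores = [
        sum(q * k for q, k in zip(query, feat)) / math.sqrt(dim)
        for feat in frame_features
    ]
    weights = softmax(scores)
    fused = [
        sum(w * feat[i] for w, feat in zip(weights, frame_features))
        for i in range(dim)
    ]
    return weights, fused


# Hypothetical 4-dimensional features for three earlier frames.
history = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0, 0.0],
]
# Hypothetical feature of the current frame, used as the attention query.
current = [1.0, 0.0, 0.0, 0.0]

weights, fused = aggregate_frames(current, history)
print("frame weights:", weights)
print("aggregated feature:", fused)
```

In this toy setting, frames whose features resemble the current frame receive larger weights, which is the behavior the abstract attributes to learning each frame's contribution. A real tracker would apply learned query, key, and value projections to spatial feature maps rather than to raw vectors.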

Keywords: visual tracking; Transformer; spatio-temporal features; self-attention; feature encoding

Classification: TP391.41 [Automation and Computer Technology: Computer Application Technology]
