Visual object tracking with spatial-temporal feature enhancement and perception

Authors: GUO Husheng (郭虎升) [1,2]; LIU Zhengqi (刘正琪); LIU Yanjie (刘艳杰); WANG Wenjian (王文剑)

Affiliations: [1] School of Computer and Information Technology, Shanxi University, Taiyuan 030006, Shanxi, China; [2] Key Laboratory of Computational Intelligence and Chinese Information Processing (Shanxi University), Ministry of Education, Taiyuan 030006, Shanxi, China

Source: Journal of Shaanxi Normal University: Natural Science Edition, 2025, No. 1, pp. 60-70 (11 pages)

Funding: National Natural Science Foundation of China (62276157, 62476157, U21A20513, 62076154, 61503229); Key Research and Development Program of Shanxi Province (202202020101003).

Abstract: Most Transformer-based object tracking models extract only limited local spatial feature information about the target and underuse temporal features, which significantly degrades their performance in complex scenarios such as target occlusion, deformation, and scale change. To address this, a visual object tracking method with spatial-temporal feature enhancement and perception (STFEP) is proposed. On the one hand, the method uses a Transformer to extract and fuse search-region and temporal-context features to obtain global feature information; a purpose-designed local convolutional neural network extracts the target's local feature information and associates it with the target's global feature information, further enhancing the target's feature representation. On the other hand, a spatial-temporal feature perception mechanism is proposed that analyzes the reliability and necessity of feature information at different moments and constructs dynamic templates to perceive richer spatial-temporal information, enabling the model to adapt to complex changes in the target and the scene. Experimental results on the TrackingNet, GOT-10k, LaSOT, and UAV123 datasets show that the proposed method tracks targets accurately and robustly and achieves the best results on GOT-10k, with AO, SR0.5, and SR0.75 reaching 73.7%, 83.8%, and 70.6%, respectively.
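For readers unfamiliar with the reported GOT-10k metrics, the following is a minimal sketch of how AO (average overlap) and SR at a threshold (success rate) can be computed from per-frame bounding boxes. It is not the authors' evaluation code. The `(x, y, width, height)` box format and the function names `iou`, `average_overlap`, and `success_rate` are assumptions made for illustration, and the official benchmark toolkit may aggregate results across sequences differently.

```python
from typing import Sequence, Tuple

# Assumed box format for this sketch: (x, y, width, height).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Width and height of the intersection rectangle (zero if disjoint).
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def average_overlap(overlaps: Sequence[float]) -> float:
    """AO: mean IoU over all evaluated frames."""
    return sum(overlaps) / len(overlaps)


def success_rate(overlaps: Sequence[float], threshold: float) -> float:
    """SR: fraction of frames whose IoU exceeds the threshold."""
    return sum(1 for o in overlaps if o > threshold) / len(overlaps)


# Hypothetical predictions and ground truth for four frames.
predicted = [(0, 0, 10, 10), (5, 0, 10, 10), (20, 20, 5, 5), (0, 0, 10, 6)]
ground_truth = [(0, 0, 10, 10), (0, 0, 10, 10), (0, 0, 5, 5), (0, 0, 10, 10)]

overlaps = [iou(p, g) for p, g in zip(predicted, ground_truth)]
ao = average_overlap(overlaps)
sr_50 = success_rate(overlaps, 0.5)
sr_75 = success_rate(overlaps, 0.75)
print(f"AO={ao:.3f} SR0.5={sr_50:.2f} SR0.75={sr_75:.2f}")
```

Under these definitions, SR0.75 can never exceed SR0.5 for the same tracker, which is consistent with the figures reported in the abstract (70.6% versus 83.8%).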

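Keywords: visual object tracking; spatial-temporal feature enhancement; global-local information association; spatial-temporal feature perception; dynamic template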

Classification number: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]

 
