Authors: Xue Wanli; Zhang Zhibin; Pei Shenglei [2]; Zhang Kaihua [3]; Chen Shengyong
Affiliations: [1] School of Computer Science and Engineering, Tianjin University of Technology, Tianjin 300384; [2] College of Physics and Electronic Information Engineering, Qinghai Minzu University, Xining 810007; [3] School of Computer Science, Nanjing University of Information Science & Technology, Nanjing 130012
Source: Journal of Computer Research and Development, 2024, No. 2, pp. 460-469 (10 pages)
Funding: National Natural Science Foundation of China (62376197, 61906135, 61876088, 92048301, 62020106004); Jiangsu Province 333 Talent Project (BRA2020291).
Abstract: Current mainstream Transformer-based tracking frameworks have three problems in feature extraction and fusion: 1) feature extraction and fusion are performed as separate modules, which tends to produce suboptimal model training results; 2) the self-attention mechanism, whose computational complexity is O(N^2), reduces tracking efficiency; 3) a simple target template selection strategy has difficulty adapting to drastic changes in target appearance during tracking. To address these problems, we propose a novel Transformer-based visual object tracking framework that uses the fast Fourier transform to mix target tokens and search region tokens. For problem 1, an efficient end-to-end approach is proposed that learns feature extraction and fusion jointly to obtain an optimal model. For problem 2, the fast Fourier transform is used to achieve complete information interaction between the target tokens and the search region tokens; the computational complexity of this operation is O(N log N), which helps improve tracking efficiency. For problem 3, a target template memory storage mechanism based on tracking quality assessment is proposed, which can quickly adapt to drastic changes in target appearance. On three standard datasets, LaSOT, OTB100, and UAV123, the proposed tracker achieves better performance in both efficiency and accuracy than current state-of-the-art methods.
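Keywords: Transformer; fast Fourier transform; feature extraction; feature fusion; object tracking
Classification No.: TP391 [Automation and Computer Technology: Computer Application Technology]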
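The abstract states that target and search region tokens are mixed with a fast Fourier transform at O(N log N) cost, but it does not give the layer design. The following is a minimal sketch of one common way to do such mixing (an FNet-style 2D FFT that keeps the real part), not the authors' implementation. The function name `fourier_mix`, the token counts, and the channel width are hypothetical choices for illustration.

```python
import numpy as np


def fourier_mix(template_tokens, search_tokens):
    """Mix template and search-region tokens with a 2D FFT.

    Assumption: FNet-style mixing (FFT over the token axis and the
    channel axis, real part kept). The abstract does not specify this.
    """
    # Stack both token sets along the token axis: shape (N_t + N_s, D).
    tokens = np.concatenate([template_tokens, search_tokens], axis=0)
    # fft2 transforms the last two axes, so every output token depends
    # on every input token, with no pairwise attention matrix.
    mixed = np.fft.fft2(tokens).real
    n_t = template_tokens.shape[0]
    # Split back into template and search-region parts.
    return mixed[:n_t], mixed[n_t:]


rng = np.random.default_rng(0)
z = rng.standard_normal((16, 8))  # hypothetical template tokens (N_t=16, D=8)
x = rng.standard_normal((64, 8))  # hypothetical search tokens (N_s=64, D=8)
z_mixed, x_mixed = fourier_mix(z, x)
```

Because the transform runs over the concatenated sequence, a change to any search region token changes the mixed template tokens as well, which is one reading of the "complete information interaction" described in the abstract. The mixing step has no learned parameters; in a full model it would sit between learned layers.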