Authors: LI Na (李娜); LIU Mengqiao (刘蒙巧); PAN Jinting (潘金婷); HUANG Kai (黄开); JIA Xingxuan (贾兴轩) (School of Communication and Information Engineering, Xi'an University of Posts and Telecommunications, Xi'an 710121, China)
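Affiliation: [1] School of Communication and Information Engineering, Xi'an University of Posts and Telecommunications, Xi'an 710121, Shaanxi, China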
Source: Optics and Precision Engineering (《光学精密工程》), 2025, No. 4, pp. 653-664 (12 pages)
Funding: National Natural Science Foundation of China (No. 41874173); Shaanxi Provincial Industrial Key Research Project (No. 2022GY-078).
Abstract: To achieve high-precision, high-frame-rate tracking with complex deep models under limited computing resources, this paper proposes a Transformer-based visual tracker built on knowledge distillation. An image dynamic correction module fuses the current-frame search image with an optical-flow-based predicted image, which helps the tracker cope with fast target motion and motion blur. To reduce model complexity, a knowledge distillation strategy is used to compress the model, and homoscedastic uncertainty is incorporated into the loss function so that the loss weights of the different subtasks are learned by the network, avoiding cumbersome manual tuning. In addition, a random blurring strategy is applied while training the student network to improve robustness. Two tracking frameworks of different complexity, KTransT-T and KTransT, are proposed and compared with 12 algorithms on five public datasets. The experiments show that KTransT-T improves tracking precision and success rate, while KTransT keeps model complexity low and achieves tracking precision comparable to mainstream algorithms. KTransT runs at up to 158 frames per second, which meets the requirements of real-time tracking.
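The abstract does not state the exact form of the uncertainty-weighted loss, so the following is only a minimal sketch of one commonly used homoscedastic-uncertainty formulation, in which each subtask loss L_i is scaled by exp(-s_i) and regularized by s_i, with s_i = log(sigma_i^2). The function names and the use of plain floats are assumptions for illustration; in an actual tracker the s_i would be trainable parameters updated together with the network weights, and the paper's formulation may differ.

```python
import math


def uncertainty_weighted_loss(task_losses, log_vars):
    """Combine subtask losses with log-variances s_i = log(sigma_i^2).

    Illustrative form (an assumption, not taken from the paper):
        total = sum_i( exp(-s_i) * L_i + s_i )
    """
    if len(task_losses) != len(log_vars):
        raise ValueError("need one log-variance per task loss")
    return sum(math.exp(-s) * loss + s for loss, s in zip(task_losses, log_vars))


def effective_weights(log_vars):
    """Weight applied to each subtask loss: exp(-s_i)."""
    return [math.exp(-s) for s in log_vars]


# Hypothetical subtask losses (e.g. classification and box regression).
losses = [1.0, 2.0]
log_vars = [0.0, math.log(2.0)]
total = uncertainty_weighted_loss(losses, log_vars)
weights = effective_weights(log_vars)
```

A larger s_i lowers the weight on that subtask, while the additive s_i term keeps the optimizer from driving every weight to zero, which is why the weights can be learned instead of tuned by hand.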
Keywords: computer vision; object tracking; Transformer; knowledge distillation; homoscedastic uncertainty
Classification code: TP391.4 [Automation and Computer Technology: Computer Application Technology]