Video object detection algorithm based on multi-level feature aggregation under mixed sampler

Cited by: 1

Authors: QIN Siyi; GAI Shaoyan [1,2]; DA Feipeng [1,2]

Affiliations: [1] School of Automation, Southeast University, Nanjing 210096, Jiangsu, China; [2] Key Laboratory of Measurement and Control of Complex Engineering Systems, Ministry of Education, Southeast University, Nanjing 210096, Jiangsu, China

Source: Journal of Zhejiang University: Engineering Science, 2024, No. 1, pp. 10-19 (10 pages)

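Funding: Jiangsu Provincial Special Project for Basic Research on Frontier Leading Technologies (BK20192004C); Priority Academic Program Development of Jiangsu Higher Education Institutions.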

Abstract: Existing deep learning-based video object detection algorithms fail to meet accuracy and efficiency requirements at the same time. To address this problem, a video object detection algorithm based on mixed weighted sampling and multi-level feature aggregation attention is proposed, built on the single-stage detector YOLOX-S. The mixed weighted reference-frame sampler (MWRS) combines weighted random sampling with local consecutive sampling to make full use of effective global information and inter-frame local information. The multi-level feature aggregation attention (MFAA) module, based on the self-attention mechanism, refines the classification features extracted by YOLOX-S so that the network learns richer information from features at different levels. Experimental results show that the proposed algorithm achieves an average precision (AP50) of 77.8% on the ImageNet VID dataset with an average detection speed of 11.5 ms per frame, and that its object classification and localization on the test images are clearly better than those of YOLOX-S, indicating that the algorithm attains high accuracy at a fast detection speed.
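The mixed sampling idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the frame weights are computed, how large the local window is, or whether sampling is with replacement, so everything below is an assumption made for illustration: the function name `sample_reference_frames`, its parameters, the caller-supplied `weights`, the window centred on the key frame, and the use of the Efraimidis-Spirakis scheme for weighted sampling without replacement. It is not the authors' implementation.

```python
import random
from typing import List, Optional, Sequence


def sample_reference_frames(
    num_frames: int,
    key_idx: int,
    weights: Sequence[float],
    n_global: int,
    n_local: int,
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Pick reference-frame indices for one key frame (illustrative only).

    Combines a local window of consecutive frames with a weighted random
    draw from the rest of the video. All names here are hypothetical.
    """
    if len(weights) != num_frames:
        raise ValueError("weights must have one entry per frame")
    if not 0 <= key_idx < num_frames:
        raise ValueError("key_idx out of range")
    rng = rng or random.Random()

    # Local consecutive sampling: n_local neighbouring frames around the
    # key frame (the key frame itself is inside the window), shifted so
    # that the window stays within the video.
    start = max(0, min(key_idx - n_local // 2, num_frames - n_local))
    local = list(range(start, min(start + n_local, num_frames)))
    local_set = set(local)

    # Weighted random sampling without replacement over the remaining
    # frames: each candidate gets the random key u ** (1 / w), and the
    # n_global largest keys win. Frames with non-positive weight are skipped.
    candidates = [
        i for i in range(num_frames) if i not in local_set and weights[i] > 0
    ]
    ranked = sorted(
        candidates,
        key=lambda i: rng.random() ** (1.0 / weights[i]),
        reverse=True,
    )

    # Return local and global picks together in temporal order.
    return sorted(local + ranked[:n_global])
```

Calling `sample_reference_frames(20, 10, [1.0] * 20, n_global=4, n_local=5)` returns the five frames around frame 10 plus four frames drawn from the rest of the video. With uniform weights the global draw reduces to plain random sampling; non-uniform weights bias it toward frames the caller considers more informative.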

Keywords: machine vision; video object detection; feature aggregation; attention mechanism; YOLOX

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
