Authors: REN Shuyu (任书玉), WANG Xiaoding (汪晓丁)[1], LIN Hui (林晖)[1]
Affiliation: [1] School of Computer and Cyber Space Security, Fujian Normal University, Fuzhou 350117, Fujian, China
Source: Computer Engineering (《计算机工程》), 2024, No. 12, pp. 16-32 (17 pages)
Funding: National Natural Science Foundation of China (61702103, U1905211); Natural Science Foundation of Fujian Province (2020J01167, 2020J01169).
Abstract: The superior performance of the Transformer in natural language processing has inspired researchers to explore its application to computer vision tasks. The Transformer-based object detection model, Detection Transformer (DETR), treats object detection as a set prediction problem, introducing the Transformer model to address this task and eliminating the proposal generation and post-processing steps that are typical of traditional methods. The original DETR model suffers from slow training convergence and inefficient detection of small objects. To address these problems, researchers have made improvements in several directions to enhance DETR performance. This study conducts an in-depth investigation of both the basic and enhanced modules of DETR, including modifications to the backbone architecture, query design strategies, and improvements to the attention mechanism. It also provides a comparative analysis of various detectors, evaluating their performance and network architectures. The potential and application prospects of DETR in computer vision tasks are discussed, along with its current limitations and challenges. Finally, based on the current state of object detection research, this study analyzes and summarizes related models, assesses the advantages and limitations of attention models, and outlines future research directions for attention models in object detection.
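Keywords: attention mechanism; computer vision; deep learning; DETR model; object detection
Classification code: TP39 [Automation and Computer Technology: Computer Application Technology]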