Authors: QUE Yue; GAN Menghan; LIU Zhiwei [1] (School of Information Engineering, East China Jiaotong University, Nanchang 330013, China)
Source: Computer Science (《计算机科学》), 2024, Issue S01, pp. 447-452 (6 pages)
Funding: National Natural Science Foundation of China (62362032); Natural Science Foundation of Jiangxi Province (20232BAB212011).
Abstract: Object detection aims to accurately recognize and localize objects in images and is an important research area in computer vision. Deep learning-based object detection has made great progress, but shortcomings remain. The semantic information brought by large down-sampling factors benefits image classification, but down-sampling inevitably loses information, which leaves feature extraction insufficient and lowers detection accuracy. To address these problems, this paper proposes a receptive field enhancement and multi-branch aggregation model for object detection. First, a receptive field enhancement module is designed to expand the receptive field of the backbone network. The module captures object context cues without changing the spatial resolution of the features, which alleviates the loss of object information during down-sampling. Then, to take full advantage of the locality of convolutional neural networks and the long-range feature dependencies modeled by the self-attention mechanism, a receptive field expanding composite backbone network is constructed to retain local features and improve the model's global feature perception. Finally, a multi-branch aggregation detection head network is proposed, which forms information flow among the three prediction branches and fuses their feature information to improve detection capability. Validation experiments on the MS COCO dataset show that the average precision of the proposed model is better than that of many mainstream object detection models.
Keywords: object detection; self-attention mechanism; receptive field expansion; feature fusion; decoupled detection head
Classification number: TP391.4 [Automation and Computer Technology: Computer Application Technology]
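Illustrative note: the abstract states that the receptive field enhancement module enlarges the receptive field without changing the spatial resolution of the features, but it does not say how the module is built. One common technique with that property is dilated (atrous) convolution. The sketch below assumes that technique purely for illustration and is not the authors' module; the function names `conv_output_size` and `receptive_field` and the layer configurations are hypothetical. It uses the standard convolution arithmetic to compare a stack of plain 3x3 convolutions with a stack of dilated ones.

```python
def conv_output_size(size, kernel, stride=1, padding=0, dilation=1):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


def receptive_field(layers):
    """Receptive field of a stack of layers given as (kernel, stride, dilation)."""
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent outputs
    for kernel, stride, dilation in layers:
        rf += (kernel - 1) * dilation * jump
        jump *= stride
    return rf


# Hypothetical stacks: three 3x3 convolutions, all with stride 1.
plain = [(3, 1, 1), (3, 1, 1), (3, 1, 1)]
dilated = [(3, 1, 1), (3, 1, 2), (3, 1, 4)]

print(receptive_field(plain))    # receptive field of the plain stack
print(receptive_field(dilated))  # receptive field of the dilated stack

# With padding equal to the dilation rate, a 3x3 convolution with stride 1
# keeps the feature map size unchanged, whatever the dilation.
size = 64
for _, stride, dilation in dilated:
    size = conv_output_size(size, 3, stride=stride, padding=dilation, dilation=dilation)
print(size)

# By contrast, a stride-2 convolution halves the resolution.
print(conv_output_size(64, 3, stride=2, padding=1))
```

Under these assumptions the dilated stack covers a larger input region than the plain stack while the feature map keeps its size, whereas enlarging the receptive field through strided down-sampling reduces the resolution, which is the source of information loss the abstract refers to.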