Visual oriented object detection via feature alignment and Gaussian parameterization  Cited by: 2


Authors: Xue YANG (杨学); Junchi YAN (严骏驰) (Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China; MoE Key Lab of Artificial Intelligence, Shanghai Jiao Tong University, Shanghai 200240, China; Shanghai Artificial Intelligence Laboratory, Shanghai 200030, China)

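Affiliations: [1] Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240 [2] MoE Key Lab of Artificial Intelligence, Shanghai Jiao Tong University, Shanghai 200240 [3] Shanghai Artificial Intelligence Laboratory, Shanghai 200030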

Source: Scientia Sinica (Informationis) (《中国科学:信息科学》), 2023, Issue 11, pp. 2250-2265 (16 pages)

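Funding: Supported by the Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (Grant No. 2020AAA0107600); the National Natural Science Foundation of China Excellent Young Scientists Fund (Grant No. 62222607); and the Shanghai Municipal Science and Technology Major Project (Grant No. 2021SHZDZX0102).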

Abstract: Oriented object detection is a research hotspot in computer vision, with a wide range of applications in remote sensing, scene text, and other areas. Large aspect ratios, dense arrangement, and arbitrary orientation are the main challenges in this field. This paper presents a refined (cascaded) oriented detector, R3DetGauss, built on a single-stage detection method, which uses coarse-to-fine progressive regression to locate objects quickly and accurately. To address the feature misalignment that arises in refined detectors, the paper designs a feature refinement module (FRM) that obtains more accurate features and thereby improves detection performance. Specifically, FRM re-encodes the position information of the currently refined bounding box onto the corresponding feature points through pixel-wise feature interpolation, achieving feature reconstruction and alignment. The paper also uses a scale-invariant normalized Gaussian Wasserstein distance as the regression loss to further improve the quality of the predicted bounding boxes. In addition, it proposes an aspect-ratio-aware adaptive sampling strategy based on this distance, which improves the quality of sample assignment. Extensive quantitative and qualitative experiments on multiple public image datasets show that R3DetGauss improves on existing baselines and achieves state-of-the-art detection accuracy. The models and code are implemented and released in the domestic open-source deep learning framework Jittor, as well as in PyTorch and TensorFlow.
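
As a rough illustration of the Gaussian Wasserstein distance the abstract refers to, the sketch below maps a rotated box to a 2-D Gaussian and computes the squared 2-Wasserstein distance between two such Gaussians. The box-to-Gaussian mapping (covariance R diag(w²/4, h²/4) Rᵀ), the (cx, cy, w, h, theta) box layout, and all function names are assumptions made for illustration; the abstract does not give them. The paper's scale-invariant normalization is also not specified in this record, so it is deliberately left out.

```python
import numpy as np

def box_to_gaussian(cx, cy, w, h, theta):
    """Map a rotated box to (mean, covariance). theta is in radians.

    Assumed mapping: Sigma = R diag(w^2/4, h^2/4) R^T.
    """
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    D = np.diag([w ** 2 / 4.0, h ** 2 / 4.0])
    return np.array([cx, cy], dtype=float), R @ D @ R.T

def sqrtm_psd(M):
    """Square root of a symmetric positive semi-definite matrix."""
    M = (M + M.T) / 2.0                  # remove tiny asymmetries
    vals, vecs = np.linalg.eigh(M)
    vals = np.clip(vals, 0.0, None)      # guard against small negatives
    return (vecs * np.sqrt(vals)) @ vecs.T

def wasserstein2_sq(box_a, box_b):
    """Squared 2-Wasserstein distance between the Gaussians of two boxes."""
    mu1, S1 = box_to_gaussian(*box_a)
    mu2, S2 = box_to_gaussian(*box_b)
    S1_half = sqrtm_psd(S1)
    cross = sqrtm_psd(S1_half @ S2 @ S1_half)
    center_term = np.sum((mu1 - mu2) ** 2)
    shape_term = np.trace(S1 + S2 - 2.0 * cross)
    return float(center_term + shape_term)
```

Under this mapping, two boxes with the same shape and orientation differ only by the squared distance between their centers, and a box rotated by 180 degrees maps to the same Gaussian as the original, so their distance is zero.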

Keywords: oriented object detection; computer vision; feature refinement module; distribution distance; label assignment; regression loss

Classification code: TP391.41 [Automation and Computer Technology: Computer Application Technology]

 
