Authors: ZHENG Jin [1,2]; WANG Sen; LI Hang; ZHOU Yu-Hai (School of Computer Science and Engineering, Beihang University, Beijing 100191; State Key Laboratory of Virtual Reality Technology and Systems, Beijing 100191)
Affiliations: [1] School of Computer Science and Engineering, Beihang University, Beijing 100191 [2] State Key Laboratory of Virtual Reality Technology and Systems, Beijing 100191
Source: Chinese Journal of Computers (《计算机学报》), 2024, No. 12, pp. 2803-2818 (16 pages)
Abstract: Current mainstream monocular 3D object detection networks follow a keypoint detection paradigm and suffer from inaccurate keypoint prediction and depth estimation, which limits the performance of monocular 3D detectors. This paper proposes MonoAux, a monocular 3D object detection algorithm with multi-keypoint constraints and depth estimation assistance. It uses the projected corner points of the 3D bounding box and the projected centers of its upper and lower surfaces to supplement the projected center of the 3D box, improving keypoint prediction accuracy through multi-keypoint constraints. It also proposes a LiDAR-free decoupled depth estimation method that, without introducing LiDAR point cloud data, derives additional auxiliary supervision signals for depth estimation from geometric relationships, improving depth estimation accuracy. The multi-keypoint constraints and depth estimation assistance are used only during training and add no computational cost at inference. Results on the KITTI 3D object detection validation and test sets show that, compared with the MonoDLE baseline network, MonoAux improves detection accuracy by 3.87% and 4.64%, respectively. Compared with other SOTA methods, the proposed method also shows a significant performance advantage and even outperforms some methods that use extra data.
English abstract: The mainstream monocular 3D object detection algorithms typically rely on a keypoint-based paradigm. While widely adopted, these approaches often face challenges in accurately predicting keypoints and estimating depth, which ultimately limit the performance of monocular 3D detectors. The core problem lies in the inherent difficulty of generating precise keypoints and depth values from a single 2D image. This paper introduces a novel solution to these issues: a monocular 3D detector named MonoAux that incorporates multi-keypoint constraints and depth estimation assistance. Traditional monocular 3D detection algorithms generally use the center projection point of the 3D bounding box as the primary keypoint for detection and localization tasks. However, relying solely on this center point often leads to suboptimal results, as it doesn't fully capture the spatial characteristics of the object. To improve the precision of keypoint prediction, MonoAux introduces multiple keypoints into the process. Specifically, it uses the corner points of the 3D bounding box and the center points of both the upper and lower surfaces of the bounding box. These additional keypoints serve as supplementary constraints that improve keypoint prediction and thus enhance the algorithm's ability to accurately estimate the object's orientation and shape in 3D space. By improving the prediction of these keypoints, MonoAux is able to generate more accurate 3D bounding boxes, which in turn improves object detection performance. In addition to the multi-keypoint constraints, MonoAux introduces a novel approach to depth estimation that operates entirely without the use of LiDAR data. Many state-of-the-art (SOTA) 3D object detection methods rely on LiDAR point clouds to obtain accurate depth information, but this can be computationally expensive and requires specialized hardware. MonoAux tackles this challenge by proposing a LiDAR-free decoupling depth estimation method, which enhances the accuracy of depth estimation using only the geometri
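Keywords: 3D object detection; keypoint prediction; projected corner points; depth estimation; LiDAR point cloud
CLC number: TP391 [Automation and Computer Technology: Computer Application Technology]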
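Note: The abstract names the keypoints MonoAux supervises (the projected 3D box center, the eight projected corners, and the projected centers of the upper and lower surfaces) but this record does not give the formulas. The sketch below only illustrates how such keypoints could be obtained from a ground-truth box with a pinhole camera model, and how a depth value follows geometrically from the projected top and bottom centers. The function names, the KITTI-style camera convention (x right, y down, z forward, box anchored at its bottom-face center), and the height-based depth relation are assumptions made for illustration, not the authors' implementation.

```python
import math

def box_keypoints_3d(bottom_center, dims, ry):
    """Hypothetical helper: 3D keypoints of a box in camera coordinates.

    Assumed convention: x right, y down, z forward.
    bottom_center: (x, y, z) of the bottom-face center
    dims: (h, w, l) height, width, length in meters
    ry: yaw angle around the camera y axis, in radians
    """
    x, y, z = bottom_center
    h, w, l = dims
    c, s = math.cos(ry), math.sin(ry)
    pts = {}
    i = 0
    # y points down, so the top face sits at y - h.
    for dy in (0.0, -h):
        for dx, dz in ((l / 2, w / 2), (l / 2, -w / 2),
                       (-l / 2, -w / 2), (-l / 2, w / 2)):
            # Rotate the local offset around the y axis, then translate.
            pts[f"corner{i}"] = (x + c * dx + s * dz, y + dy, z - s * dx + c * dz)
            i += 1
    pts["bottom_center"] = (x, y, z)
    pts["top_center"] = (x, y - h, z)
    pts["center"] = (x, y - h / 2, z)
    return pts

def project(point, fx, fy, cx, cy):
    """Pinhole projection of a camera-frame 3D point to pixel coordinates."""
    X, Y, Z = point
    return (fx * X / Z + cx, fy * Y / Z + cy)

def depth_from_height(v_top, v_bottom, height_3d, fy):
    """Depth from the pixel distance between projected top and bottom centers.

    Both centers share the same z, so v_bottom - v_top = fy * height_3d / z.
    """
    return fy * height_3d / (v_bottom - v_top)

# Example with made-up intrinsics and a made-up car-sized box.
fx = fy = 700.0
cx, cy = 600.0, 180.0
kps3d = box_keypoints_3d((1.0, 1.5, 20.0), (1.5, 1.6, 4.0), ry=0.0)
kps2d = {name: project(p, fx, fy, cx, cy) for name, p in kps3d.items()}
z_geo = depth_from_height(kps2d["top_center"][1], kps2d["bottom_center"][1], 1.5, fy)
```

In a training setup of the kind the abstract describes, projected keypoints like `kps2d` would serve as regression targets for auxiliary heads, and a geometric depth such as `z_geo` could act as an extra supervision signal without LiDAR; both would be dropped at inference.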