Affiliations: [1] School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 101408, China; [2] Beijing Wanji Technology Co., Ltd., Beijing 100080, China; [3] Computer School, Beijing Information Science and Technology University, Beijing 100192, China
Source: Journal of Image and Graphics (中国图象图形学报), 2024, No. 8, pp. 2399-2412 (14 pages)
Funding: National Natural Science Foundation of China (62271466).
Abstract:

Objective: Perception systems are integral components of modern autonomous driving systems. They are designed to accurately estimate the state of the surrounding environment and to provide reliable observations for prediction and planning. 3D object detection predicts the location, size, and category of key 3D objects near the autonomous vehicle and is an important part of the perception system. Common data types in 3D object detection include images and point clouds. Unlike an image, a point cloud is a set of points in 3D space, where each point's position is given by coordinates in a 3D coordinate system, usually together with attributes such as reflection intensity. In computer vision, point clouds are often used to represent the shape and structure of 3D objects. Point-cloud-based 3D object detection therefore works with real spatial information and often has advantages in detection accuracy and speed. However, because point clouds are unstructured, they are usually converted into a 3D voxel grid: each voxel is treated as a 3D feature vector, a 3D convolutional network extracts voxel features, and detection is performed on those features. In voxel-based 3D object detection, voxelization loses part of the point cloud's data and structural information, which degrades detection. We propose a method that fuses point-cloud depth information to address this problem: the depth information compensates for the information lost during voxelization, and the efficient YOLOv7 backbone extracts the fused features, improving feature extraction, multi-scale object detection, and overall 3D detection accuracy.

Method: The point cloud is first converted into a depth image by spherical projection. The depth image is then fused with the feature map extracted by the 3D detection network to compensate for the information lost in voxelization. Because the fused features are represented as a 2D pseudo-image, the backbone network of YOLOv7 (you only look once v7) is used to extract them. Finally, a regression and classification network is designed, and the extracted fused features are fed into it to predict the location, size, and category of each object.

Results: The method is evaluated on the KITTI (Karlsruhe Institute of Technology and Toyota Technological Institute at Chicago) dataset and the DAIR-V2X dataset, with AP (average precision) as the metric. On KITTI, the improved algorithm PP-Depth outperforms PointPillars by 0.84%, 2.3%, and 1.77% on the car, pedestrian, and cyclist categories, respectively. On the cyclist category at easy difficulty, the improved algorithm PP-YOLO-Depth outperforms PointPillars, PP-YOLO, and PP-Depth by 5.15%, 1.1%, and 2.75%, respectively. On DAIR-V2X, PP-Depth outperforms PointPillars by 17.46%, 20.72%, and 12.7% on the car, pedestrian, and cyclist categories, respectively. On the car category at easy difficulty, PP-YOLO-Depth outperforms PointPillars, PP-YOLO, and PP-Depth by 13.53%, 5.59%, and 1.08%, respectively.

Conclusion: The method performs well on both the KITTI and DAIR-V2X datasets. It reduces the information lost when the point cloud is voxelized, strengthens the network's ability to extract fused features, improves the detection of multi-scale objects, and makes the detection results more accurate.
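The method's first step, converting a LiDAR point cloud into a depth (range) image by spherical projection, follows a standard construction. Below is a minimal NumPy sketch of that step under assumptions the abstract does not fix: the image resolution and vertical field of view are placeholder Velodyne HDL-64E (KITTI-style) values, and the closing channel concatenation is only one plausible fusion operator with a hypothetical feature map, not necessarily the paper's exact design.

```python
import numpy as np

def spherical_projection(points, H=64, W=1024,
                         fov_up_deg=3.0, fov_down_deg=-25.0):
    """Project an (N, 4) LiDAR cloud [x, y, z, intensity] onto an
    (H, W) depth image via spherical projection.

    H, W and the vertical field of view default to common Velodyne
    HDL-64E (KITTI) settings; the paper's exact parameters are not
    stated in the abstract, so treat these as assumptions.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    depth = np.linalg.norm(points[:, :3], axis=1)        # range r per point

    yaw = np.arctan2(y, x)                               # azimuth in [-pi, pi]
    pitch = np.arcsin(np.clip(z / np.maximum(depth, 1e-8), -1.0, 1.0))

    fov_up = np.deg2rad(fov_up_deg)
    fov_down = np.deg2rad(fov_down_deg)
    fov = fov_up - fov_down

    # Map angles to pixel coordinates (u: column, v: row).
    u = 0.5 * (1.0 - yaw / np.pi) * W
    v = (fov_up - pitch) / fov * H
    u = np.clip(np.floor(u), 0, W - 1).astype(np.int32)
    v = np.clip(np.floor(v), 0, H - 1).astype(np.int32)

    # Write far points first so the nearest return per pixel survives,
    # which mirrors how occlusion is handled in range images.
    order = np.argsort(depth)[::-1]
    range_image = np.zeros((H, W), dtype=np.float32)
    range_image[v[order], u[order]] = depth[order]
    return range_image

# Dummy cloud for illustration: 1000 random points with intensities.
pts = np.random.uniform(-50.0, 50.0, size=(1000, 4)).astype(np.float32)
rng_img = spherical_projection(pts)
print(rng_img.shape)  # (64, 1024)

# One plausible fusion step (the abstract does not specify the operator):
# concatenate the depth image with a C-channel pseudo-image feature map
# of matching spatial size as an extra channel. `fake_features` is a
# hypothetical stand-in for the detector's feature map.
fake_features = np.zeros((8, 64, 1024), dtype=np.float32)
fused = np.concatenate([fake_features, rng_img[None]], axis=0)  # (9, 64, 1024)
```

In practice the detector's pseudo-image and the range image live on different grids (bird's-eye view versus spherical), so the fusion stage would also need a resampling or alignment step before concatenation; the sketch assumes matched spatial sizes for brevity.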
Keywords: autonomous driving; 3D point cloud object detection; depth information fusion; point cloud voxelization; KITTI dataset
CLC number: TP391 [Automation and Computer Technology - Computer Application Technology]