Myvoxel R-CNN: 3D Point Cloud Object Detection Model Based on Voxelization

Authors: HAN Jiandong [1,2]; FAN Xueyuan

Affiliations: [1] School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China; [2] Key Laboratory of Computational Intelligence and Chinese Information Processing, Ministry of Education, Shanxi University, Taiyuan 030006, China

Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2024, No. 8, pp. 1908-1913 (6 pages)

Funding: Supported by the Natural Science Foundation of Shanxi Province (20210302123443).

Abstract: To address insufficient feature extraction, low detection accuracy on hard objects, and limited model generalization in current 3D point cloud object detection, this paper proposes Myvoxel R-CNN, a new single-modal 3D point cloud object detection model. The model consists of three main modules: a 3D backbone network, a 2D bird's-eye-view (BEV) region proposal network (a 2D backbone network plus a region proposal network, RPN), and a detection head. A multi-head self-attention module and residual blocks based on sparse convolution are added to the 3D backbone network, which strengthens its voxel feature learning and captures more of the correlations within the data and the features. A 2D backbone network composed of attention fusion modules is designed to increase the original model's attention to 2D features. To further improve generalization, a new data augmentation scheme, random local pyramid data augmentation, is introduced to generate augmented object samples in a shape-aware manner. On the KITTI dataset, the model improves AP 3D for cars at the Hard level by about 2.23%, and at the Easy and Moderate levels by about 0.60% and 0.62%, respectively. For pedestrians, AP 3D and AP BEV improve by about 0.62% and 0.86% at the Easy level and by about 1.45% and 1.53% at the Hard level. The experimental results show that Myvoxel R-CNN outperforms the other methods on the KITTI dataset.
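
The abstract does not describe how the random local pyramid data augmentation works internally. The sketch below shows one possible reading, offered only as an illustration and not as the authors' method: an object's bounding box is split into six pyramids (apex at the box center, one box face as each base), and the points falling in one randomly chosen pyramid are dropped. The function names `pyramid_index` and `random_pyramid_dropout`, the axis-aligned box (no yaw rotation), and the choice of dropout as the local operation are all assumptions made for this example.

```python
import random


def pyramid_index(point, center, dims):
    """Return which of six box pyramids (0..5) contains `point`.

    Assumption: the box is axis-aligned. Each pyramid has its apex at the
    box center and one box face as its base. The index is
    2 * axis + (1 if the point lies on the positive side, else 0).
    """
    # Scale the offset by the half-dimensions so every face sits at +/-1;
    # the axis with the largest scaled magnitude identifies the facing side.
    scaled = [(p - c) / (d / 2.0) for p, c, d in zip(point, center, dims)]
    axis = max(range(3), key=lambda i: abs(scaled[i]))
    return 2 * axis + (1 if scaled[axis] >= 0 else 0)


def random_pyramid_dropout(points, center, dims, rng=None):
    """Drop every point inside one randomly chosen pyramid (hypothetical op).

    Returns the kept points and the index of the dropped pyramid.
    """
    rng = rng or random.Random()
    dropped = rng.randrange(6)
    kept = [p for p in points if pyramid_index(p, center, dims) != dropped]
    return kept, dropped


if __name__ == "__main__":
    # Toy object: one point per pyramid inside a 4 x 2 x 1.5 box at the origin.
    pts = [(1, 0, 0), (-1, 0, 0), (0, 0.5, 0), (0, -0.5, 0), (0, 0, 0.4), (0, 0, -0.4)]
    kept, dropped = random_pyramid_dropout(pts, (0, 0, 0), (4.0, 2.0, 1.5), random.Random(0))
    print(len(kept), "points kept; pyramid", dropped, "dropped")
```

Because only part of the object is altered while the rest of its shape is preserved, an operation of this kind is one way an augmented sample could stay "shape-aware"; a real pipeline would also need to handle rotated boxes and update any per-point features.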

Keywords: 3D object detection; point cloud; attention; residual block; data augmentation

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
