LiDAR-Radar Fusion Object Detection Algorithm Based on BEV Occupancy Prediction

Authors: LI Yuehao, WANG Dengjiang, JIAN Haifang [1], WANG Hongchang, CHENG Qinghua

Affiliations: [1] Laboratory of Solid State Optoelectronics Information Technology, Institute of Semiconductors, Chinese Academy of Sciences, Beijing 100083, China; [2] College of Materials Science and Opto-Electronic Technology, University of Chinese Academy of Sciences, Beijing 101499, China; [3] Beijing VanJee Technology Suzhou R&D Institute, Suzhou, Jiangsu 215133, China

Published in: Computer Science (《计算机科学》), 2024, No. 6, pp. 215-222 (8 pages)

Funding: Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2022ZD0116300).

Abstract: Beam attenuation and target occlusion in the operating environment of LiDAR cause the output point cloud to become sparse at long range, which in turn makes the accuracy of LiDAR-based 3D object detection algorithms degrade with distance. To address this problem, a LiDAR-radar fusion object detection algorithm based on occupancy prediction in bird's eye view (BEV) space is proposed. First, a simplified BEV occupancy prediction sub-network is introduced to generate position-aware millimeter-wave radar features; it also helps alleviate the network convergence difficulties caused by the sparsity of radar data. Second, to achieve cross-modal feature fusion, a multi-scale LiDAR-radar feature fusion layer based on feature correlation in BEV space is designed. Experiments on the nuScenes dataset show that the proposed radar branch network reaches a mean average precision (mAP) of 21.6% with an inference time of 8.3 ms. With the fusion layer added, the multi-modal detection algorithm improves mAP by 2.9% over the baseline CenterPoint, at an additional inference cost of only 8.6 ms. At a distance of 30 m from the sensor, the detection accuracy attainment rates of the multi-modal algorithm on the 10 nuScenes categories improve over CenterPoint by 2.1% to 16.0%.
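The core idea described in the abstract — using a BEV occupancy map to gate sparse radar features before fusing them with LiDAR BEV features — can be illustrated with a minimal sketch. This is not the paper's implementation: here the occupancy grid is rasterized directly from radar points rather than predicted by a learned sub-network, fusion is single-scale channel concatenation rather than the paper's multi-scale correlation-based layer, and all grid sizes and extents are arbitrary illustrative choices.

```python
import random

def points_to_bev_occupancy(points, grid_size=64, extent=50.0):
    """Rasterize (x, y) points into a binary BEV occupancy grid.

    Stand-in for the paper's learned occupancy prediction sub-network:
    a cell is marked occupied if any radar return falls inside it.
    The grid covers [-extent, extent] meters on both axes."""
    occ = [[0.0] * grid_size for _ in range(grid_size)]
    cell = 2.0 * extent / grid_size
    for x, y in points:
        i = int((x + extent) / cell)
        j = int((y + extent) / cell)
        if 0 <= i < grid_size and 0 <= j < grid_size:
            occ[i][j] = 1.0
    return occ

def fuse_bev_features(lidar_feat, radar_feat, occupancy):
    """Gate each radar channel by the occupancy map (suppressing
    features in empty cells, where sparse radar data is unreliable),
    then concatenate with the LiDAR channels. Single-scale stand-in
    for the paper's multi-scale fusion layer."""
    gated = [
        [[v * occupancy[i][j] for j, v in enumerate(row)]
         for i, row in enumerate(ch)]
        for ch in radar_feat
    ]
    return lidar_feat + gated  # channel-wise concatenation

# Toy example: two radar returns, random LiDAR features,
# constant radar features so gating is easy to inspect.
radar_pts = [(10.0, -5.0), (30.0, 12.0)]
occ = points_to_bev_occupancy(radar_pts)
lidar_feat = [[[random.random() for _ in range(64)]
               for _ in range(64)] for _ in range(8)]
radar_feat = [[[1.0] * 64 for _ in range(64)] for _ in range(4)]
fused = fuse_bev_features(lidar_feat, radar_feat, occ)
print(len(fused))  # 12 channels: 8 LiDAR + 4 gated radar
```

In a real network the gating would be applied to learned radar feature maps and the occupancy grid would come from the prediction sub-network, but the data flow — occupancy mask, element-wise gating, BEV-space concatenation — follows the same shape.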

Keywords: 3D object detection; LiDAR; millimeter-wave radar; occupancy prediction; bird's eye view (BEV); feature fusion

Classification: TP391 [Automation and Computer Technology — Computer Application Technology]

 
