3D Object Detection Based on Voxel Self-Attention Auxiliary Networks


Authors: Cao Jie, Peng Yiqiang, Fan Likang [1,2,3], Wang Longfei

Affiliations: [1] School of Automobile and Transportation, Xihua University, Chengdu 610039, Sichuan, China; [2] Vehicle Measurement Control and Safety Key Laboratory of Sichuan Province, Xihua University, Chengdu 610039, Sichuan, China; [3] Provincial Engineering Research Center for New Energy Vehicle Intelligent Control and Simulation Test Technology of Sichuan, Chengdu 610039, Sichuan, China

Source: Laser & Optoelectronics Progress, 2024, Issue 24, pp. 141-150 (10 pages)

Funding: Sichuan Science and Technology Innovation Base Construction Project (2022ZYD0125); Natural Science Youth Science Fund (52205129).

Abstract: LiDAR object detection algorithms that rely on convolutional neural networks (CNNs) lack a deep understanding of the spatial structure of autonomous driving scenes, which degrades detection performance. To address this, a voxel self-attention auxiliary (VSAA) network is proposed that enhances feature extraction and can be applied directly to most voxel-based detection algorithms. First, building on voxel feature encoding, the VSAA network constructs a voxel hash table to encode the voxels a second time, which makes the search for relevant voxels in the subsequent self-attention computation more efficient. Second, the VSAA network applies the self-attention mechanism at the voxel level to capture rich global information and deep contextual semantic information. Finally, the VSAA network is applied to the baseline algorithms SECOND and PV-RCNN, yielding the VA-SECOND and VA-PVRCNN algorithms; fusing VSAA and CNN features compensates for the small receptive field of the CNN and strengthens the algorithms' understanding of the whole spatial scene. Experiments on the KITTI dataset show that, compared with the baseline algorithms, VA-SECOND and VA-PVRCNN improve the average detection accuracy over all detection targets by 1.16 and 1.54 percentage points, respectively, demonstrating the effectiveness of the VSAA network.

Keywords: LiDAR; object detection; autonomous driving; voxel; self-attention

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
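The abstract describes two steps that a small example can make concrete: a hash table over voxels that speeds up the search for relevant voxels, and self-attention computed at the voxel level. The sketch below is only an illustration under stated assumptions, not the authors' VSAA implementation. It assumes integer voxel grid coordinates as hash keys, a cubic neighbourhood of hypothetical size `radius` as the definition of "relevant voxels", and plain scaled dot-product attention with no learned query/key/value projections. All function names are hypothetical.

```python
import math
from itertools import product


def build_voxel_hash(coords):
    """Map each integer voxel coordinate (x, y, z) to its row index.
    Assumption: the record does not specify the paper's actual hash encoding."""
    return {tuple(c): i for i, c in enumerate(coords)}


def neighbors(table, coord, radius=1):
    """Indices of non-empty voxels within `radius` cells of `coord` (itself included).
    Each lookup is an O(1) dictionary access instead of a scan over all voxels."""
    x, y, z = coord
    found = []
    for dx, dy, dz in product(range(-radius, radius + 1), repeat=3):
        idx = table.get((x + dx, y + dy, z + dz))
        if idx is not None:
            found.append(idx)
    return found


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(scores)  # subtract the maximum for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) / total for d in range(dim)]


def voxel_self_attention(coords, feats, radius=1):
    """Let every voxel attend over the features of its hashed neighbours.
    Assumption: features serve directly as queries, keys, and values."""
    table = build_voxel_hash(coords)
    out = []
    for coord, feat in zip(coords, feats):
        ctx = [feats[i] for i in neighbors(table, coord, radius)]
        out.append(attend(feat, ctx, ctx))
    return out


# Toy input: two adjacent voxels and one isolated voxel.
coords = [(0, 0, 0), (1, 0, 0), (5, 5, 5)]
feats = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
attended = voxel_self_attention(coords, feats)
```

In this toy input the isolated voxel at `(5, 5, 5)` finds only itself in the table, so its feature is unchanged, while the two adjacent voxels each receive a weighted mix of both features. A real detector would operate on sparse tensors on the GPU and would fuse the attended features with the CNN branch, which this sketch leaves out.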

 
