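Affiliation: [1] School of Mechanical and Automotive Engineering, South China University of Technology / Guangdong Provincial Key Laboratory of Automotive Engineering, Guangzhou 510640, Guangdong, China
Source: Journal of South China University of Technology (Natural Science Edition), 2025, No. 3, pp. 1-11 (11 pages)
Funding: Special Project for High-Quality Development of Manufacturing, Ministry of Industry and Information Technology (R-ZH-023-QT-001-20221009-001); Guangzhou Science and Technology Program (2023B01J0016).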
Abstract: Pillar-based single-stage 3D object detection algorithms for point clouds have attracted wide attention and seen broad adoption in industry because of their high runtime efficiency. However, quantizing a point cloud into pillars discards fine-grained information in its 3D features, which weakens the detection of small objects in sparse point clouds. Some studies have proposed remedies, but usually at the cost of longer detection time or lower accuracy on large objects. To address this, this paper proposes a pillar-based point cloud object detection algorithm built on an improved pillar feature encoding. First, a pillar feature encoding network that combines the local and global features of the points within each pillar is constructed to strengthen the representational capacity of the pillar-quantized features. Second, a backbone network that combines 2D sparse convolution blocks with a feature fusion network is designed to fuse multi-scale high-level abstract semantic features with low-level fine-grained spatial features, so that an excessive focus on small-scale features does not degrade detection performance on large objects. Finally, the model is trained and tested on the KITTI autonomous driving dataset, the experimental results are visualized, and ablation studies are conducted. The results show that, at the moderate difficulty level of KITTI, the proposed algorithm achieves a mean average precision of 63.54% and a mean average orientation similarity of 70.72% across multiple categories, with an average detection frame rate of 31.5 f/s. Compared with PointPillars, TANet, and PiFEnet, it improves the mean average precision by 2.44, 2.05, and 2.38 percentage points, respectively, and the mean average orientation similarity by 4.69, 0.68, and 7.83 percentage points, respectively, demonstrating potential for engineering applications in comparisons with similar algorithms.
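As a rough illustration of the two ideas the abstract names, pillar quantization and the combination of local and global information inside a pillar, the following minimal NumPy sketch may help. It is not the paper's network: the function names `pillarize` and `encode_pillar`, the grid ranges, the pillar size, and the hand-crafted 7-value descriptor are all assumptions made for this example, standing in for a learned encoder.

```python
import numpy as np


def pillarize(points, x_range=(0.0, 4.0), y_range=(0.0, 4.0), pillar_size=1.0):
    """Group (x, y, z) points into vertical pillars on an x-y grid.

    Hypothetical helper: returns a dict mapping (ix, iy) grid indices to an
    (N_i, 3) array of the points that fall into that pillar. Points outside
    the grid are dropped.
    """
    nx = int(round((x_range[1] - x_range[0]) / pillar_size))
    ny = int(round((y_range[1] - y_range[0]) / pillar_size))
    ix = np.floor((points[:, 0] - x_range[0]) / pillar_size).astype(int)
    iy = np.floor((points[:, 1] - y_range[0]) / pillar_size).astype(int)
    inside = (ix >= 0) & (ix < nx) & (iy >= 0) & (iy < ny)

    pillars = {}
    for p, i, j in zip(points[inside], ix[inside], iy[inside]):
        pillars.setdefault((int(i), int(j)), []).append(p)
    return {key: np.stack(pts) for key, pts in pillars.items()}


def encode_pillar(pts):
    """Hand-crafted stand-in for a learned pillar feature encoder.

    Local part: largest absolute offset of any point from the pillar centroid
    along each axis (keeps some fine-grained spread information).
    Global part: the pillar centroid and the number of points in the pillar.
    """
    centroid = pts.mean(axis=0)
    local = np.abs(pts - centroid).max(axis=0)
    global_part = np.append(centroid, float(len(pts)))
    return np.concatenate([local, global_part])  # shape (7,)


# Toy point cloud; the last point lies outside the assumed 4 m x 4 m grid.
points = np.array([
    [0.2, 0.3, 0.0],
    [0.8, 0.7, 1.0],
    [2.5, 3.5, 0.5],
    [9.0, 9.0, 0.0],
])
pillars = pillarize(points)
features = {key: encode_pillar(pts) for key, pts in pillars.items()}
for key in sorted(features):
    print(key, features[key])
```

Without the local term, the two points in pillar (0, 0) would be summarized by their centroid alone, which is the kind of fine-grained information loss the abstract attributes to pillar quantization.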
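Classification codes: TP391 [Automation and Computer Technology: Computer Application Technology]; U495 [Automation and Computer Technology: Computer Science and Technology]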