Authors: Wang Liangzi; Huang Miaohua [1,2,3]; Liu Ruoying; Bi Chengcheng; Hu Yongkang
Affiliations: [1] Hubei Key Laboratory of Advanced Technology for Automotive Components, Wuhan University of Technology, Wuhan 430070, Hubei, China [2] Hubei Collaborative Innovation Center for Automotive Components Technology, Wuhan University of Technology, Wuhan 430070, Hubei, China [3] Hubei Research Center for New Energy & Intelligent Connected Vehicle, Wuhan University of Technology, Wuhan 430070, Hubei, China
Source: Laser & Optoelectronics Progress (《激光与光电子学进展》), 2024, No. 18, pp. 403-412 (10 pages)
Funding: National Key Research and Development Program of China (2018YFE0105500).
Abstract: This study proposes a two-stage three-dimensional object detection algorithm for roadside scenes that improves on PointPillars and Transformer. It targets two problems in roadside point cloud object detection in complex scenes: a high missed-detection rate for distant vehicles and a high false-detection rate for pedestrians on the road. The first stage is built on PointPillars: the backbone network embeds the SimAM attention mechanism to learn similarity information and focus on important features, and the ordinary convolutional blocks in the downsampling section are replaced with convolutional blocks that have a residual structure to improve network performance. The second stage uses a Transformer to refine the candidate boxes generated in the first stage: the encoder constructs and encodes the original point features, while the decoder applies channel weighting to enhance channel information, which improves detection accuracy and reduces false detections. The proposed algorithm was evaluated on the roadside dataset DAIR-V2X-I and the vehicle-side dataset KITTI. Experimental results show that its detection accuracy is clearly higher than that of other publicly available algorithms. Compared with the baseline algorithm PointPillars at moderate detection difficulty, the detection accuracy for cars, pedestrians, and cyclists improved by 1.9, 10.5, and 2.11 percentage points, respectively, on the DAIR-V2X-I dataset, and by 2.34, 4.73, and 8.17 percentage points, respectively, on the KITTI dataset.
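The abstract names SimAM as the attention mechanism embedded in the first-stage backbone but gives no formula for it. As a rough illustration only, the sketch below follows the commonly published parameter-free SimAM formulation, applied to a single (C, H, W) feature map with NumPy. The function name `simam`, the regularizer `e_lambda = 1e-4`, and the array layout are assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def simam(x, e_lambda=1e-4):
    """SimAM-style attention on a (C, H, W) feature map (illustrative sketch).

    Each activation is weighted by a sigmoid of its inverse "energy":
    activations that deviate strongly from their channel mean receive
    weights closer to 1, the rest are damped more.
    """
    _, h, w = x.shape
    n = max(h * w - 1, 1)  # other neurons per channel; guard the 1x1 case
    mu = x.mean(axis=(1, 2), keepdims=True)        # per-channel mean
    d = (x - mu) ** 2                              # squared deviation
    var = d.sum(axis=(1, 2), keepdims=True) / n    # per-channel variance
    e_inv = d / (4.0 * (var + e_lambda)) + 0.5     # inverse energy, >= 0.5
    weights = 1.0 / (1.0 + np.exp(-e_inv))         # sigmoid
    return x * weights


# Hypothetical usage on a random pseudo-image feature map.
features = np.random.default_rng(0).normal(size=(4, 8, 8))
refined = simam(features)
```

The module adds no learnable parameters and keeps the input shape, so in principle it can be placed between existing backbone blocks without changing the layers around it. Where exactly the authors place it in the PointPillars backbone is not stated in this record.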
Keywords: three-dimensional object detection; false and missed detections; Transformer; attention mechanism; residual structure
Classification number: TP391.4 [Automation and Computer Technology: Computer Application Technology]