Small target detection model in overlooking scenes on tower cranes based on improved real-time detection Transformer


Authors: PANG Yudong (庞玉东) [1]; LI Zhixing (李志星) [1]; LIU Weijie (刘伟杰) [2]; LI Tianhao (李天昊); WANG Ningning (王宁宁)

Affiliations: [1] School of Mechanical, Electrical and Vehicle Engineering, Beijing University of Civil Engineering and Architecture, Beijing 102616, China; [2] School of Intelligent Manufacturing, Luoyang Institute of Science and Technology, Luoyang, Henan 471023, China

Source: Journal of Computer Applications (《计算机应用》), 2024, No. 12, pp. 3922-3929 (8 pages)

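Funding: Fundamental Research Funds for Beijing Municipal Universities (X21053); Key Scientific Research Project of Higher Education Institutions of Henan Province (23A460020); Natural Science Foundation of Henan Province (242300420044).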

Abstract: To address construction-site safety hazards such as objects dropped when tower crane hooks collide and casualties caused by tower crane collapse, a small target detection model for overlooking (top-down) scenes on tower cranes is proposed, based on an improved Real-Time DEtection TRansformer (RT-DETR). First, a multi-branch training, single-path inference structure designed using model reparameterization is added to the original model to improve detection speed. Second, the convolution module in the FasterNet Block is redesigned, and the resulting block replaces the BasicBlock in the original backbone to improve detection performance. Third, a new loss function, Inner-SIoU (Inner-Structured Intersection over Union), is used to further improve the model's precision and convergence speed. Finally, ablation and comparison experiments are conducted to verify the model's performance. The results show that, on images of small targets viewed from the top of a tower crane, the proposed model reaches a precision of 94.7%, 6.1 percentage points higher than the original RT-DETR model, and runs at 59.7 frames per second (FPS), a 21% improvement in detection speed over the original model. On the public COCO 2017 dataset, the Average Precision (AP) of the proposed model is 2.4, 1.5, and 1.3 percentage points higher than that of YOLOv5, YOLOv7, and YOLOv8, respectively. The proposed model therefore meets the precision and speed requirements for small target detection in overlooking scenes on tower cranes.
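
The reparameterization idea named in the abstract (several branches during training, one path at inference) can be illustrated with a minimal sketch. The abstract does not describe the branch layout the authors used, so the example below assumes a RepVGG-style pair of parallel 3x3 and 1x1 convolutions on a single channel, with stride 1, zero padding, and no BatchNorm or bias. The function names `conv2d_same` and `fuse_branches` and all numeric values are hypothetical and chosen only for illustration.

```python
# Sketch only: fuse a parallel 3x3 branch and 1x1 branch into one 3x3 kernel.
# Assumptions (not from the paper): single channel, stride 1, zero padding,
# no BatchNorm, no bias.

def conv2d_same(image, kernel):
    """Cross-correlate a 2D list with a square odd-sized kernel, zero-padded."""
    k = len(kernel)
    pad = k // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - pad, j + dj - pad
                    if 0 <= y < h and 0 <= x < w:
                        acc += image[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out


def fuse_branches(k3, k1):
    """Return a new 3x3 kernel equal to the 3x3 branch plus the 1x1 branch."""
    fused = [row[:] for row in k3]
    # A 1x1 kernel behaves like a 3x3 kernel whose only nonzero tap is the centre.
    fused[1][1] += k1[0][0]
    return fused


image = [[1.0, 2.0, 0.0, -1.0],
         [0.5, -2.0, 3.0, 1.0],
         [4.0, 0.0, 1.5, 2.0]]
k3 = [[0.1, -0.2, 0.3],
      [0.0, 0.5, -0.1],
      [0.2, 0.4, -0.3]]
k1 = [[0.7]]

# Training-time view: the two branches are evaluated separately and summed.
branch_3x3 = conv2d_same(image, k3)
branch_1x1 = conv2d_same(image, k1)
multi_branch = [[branch_3x3[i][j] + branch_1x1[i][j] for j in range(4)]
                for i in range(3)]

# Inference-time view: a single convolution with the fused kernel.
single_path = conv2d_same(image, fuse_branches(k3, k1))

# Largest element-wise gap between the two views (expected to be rounding noise).
max_diff = max(abs(multi_branch[i][j] - single_path[i][j])
               for i in range(3) for j in range(4))
```

Because convolution is linear in its kernel, the two views agree up to floating-point rounding, which is why the extra branches can be folded away after training. Real implementations would also fold BatchNorm statistics into the fused weights and biases, a step this sketch leaves out.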

Keywords: object detection; RT-DETR; small target; Transformer; computer vision; attention mechanism

CLC number: TP391.41 [Automation and Computer Technology: Computer Application Technology]

 
