Pedestrian Detection Algorithm Based on Improved YOLOv4 Model

Cited by: 2

Authors: JU Zhiyong [1]; LI Yuming; XUE Yongjie; YE Yuxin; LAI Ying (School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200082, China)

Affiliation: [1] School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200082, China

Source: Control Engineering of China, 2023, No. 10, pp. 1912-1926 (15 pages)

Funding: National Natural Science Foundation of China (81101116)

Abstract: To improve the accuracy of pedestrian detection algorithms in practical applications, a vit-YOLOv4 model is proposed that integrates the Vision Transformer model and depthwise separable convolution into the YOLOv4 model. The Vision Transformer is inserted into the backbone feature-extraction network and the spatial pyramid pooling (SPP) layer of YOLOv4, making full use of the Vision Transformer's multi-head attention mechanism to preprocess image features. Meanwhile, the stacked conventional convolutions in the path aggregation network (PANet) are replaced with depthwise separable convolutions, so that the model can extract more useful features in subsequent feature extraction. Experimental results show that the vit-YOLOv4 model improves pedestrian detection accuracy, reduces the missed-detection rate, and achieves better overall performance.
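The main motivation for replacing stacked conventional convolutions with depthwise separable ones is the reduction in parameters and computation. As a rough illustration (this sketch is not code from the paper; the kernel size and channel widths below are assumed, PANet-scale values), the parameter counts of the two convolution types can be compared directly:

```python
# Hypothetical sketch: weight counts for a standard k x k convolution versus a
# depthwise separable convolution (depthwise k x k conv + 1x1 pointwise conv),
# the substitution the abstract describes for the stacked convolutions in PANet.
# Kernel size and channel counts are illustrative assumptions, not paper values.

def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (bias terms ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(k: int, c_in: int, c_out: int) -> int:
    """Depthwise k x k convolution (one filter per input channel)
    followed by a 1x1 pointwise convolution mapping c_in -> c_out."""
    return k * k * c_in + c_in * c_out

if __name__ == "__main__":
    k, c_in, c_out = 3, 256, 512   # assumed PANet-scale channel widths
    std = standard_conv_params(k, c_in, c_out)
    dsc = depthwise_separable_params(k, c_in, c_out)
    print(f"standard: {std}, depthwise separable: {dsc}, "
          f"ratio: {std / dsc:.1f}x")  # roughly 8.8x fewer parameters here
```

For a 3x3 kernel this works out to nearly an order of magnitude fewer weights, which is why the substitution leaves capacity for the model to extract additional features at little extra cost.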

Keywords: pedestrian detection; YOLOv4; Vision Transformer; depthwise separable convolution; multi-head attention mechanism

CLC Number: TP18 [Automation and Computer Technology — Control Theory and Control Engineering]

 
