Authors: Zhu Qin (祝琴)[1,2], Han Shenyang (韩沈阳), Zeng Mingru (曾明如)[2], Lai Pinghong (赖平红)[3], Wu Chuimao (吴垂茂), Hu Weiyi (胡玮轶)
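Affiliations: [1] School of Public Policy and Management, Nanchang University, Nanchang 330036; [2] School of Information Engineering, Nanchang University, Nanchang 330036; [3] Jiangxi Provincial People's Hospital, Nanchang 330038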
Source: Automotive Engineering (《汽车工程》), 2024, No. 12, pp. 2290-2302 (13 pages)
Funding: Supported by the National Natural Science Foundation of China (72164027).
Abstract: Video vehicle detection models struggle to extract rich target features in complex traffic-monitoring scenes. To make fuller use of the spatio-temporal information in video images, this paper builds a new spatio-temporal feature fusion module, SF-Module, which uses the multi-head self-attention mechanism of the Transformer model to extract and fuse spatio-temporal features from the current and historical frames of vehicle video, enriching the feature information of the target. On this basis, the SF-Module is integrated into the neck of the YOLOv8 network to mine the spatio-temporal features of video image sequences. In addition, the WIoU loss function is introduced as the bounding-box regression loss to reduce the harmful gradients produced by low-quality annotated boxes. The resulting model is named SFW-YOLOv8. The model is evaluated on the UA-DETRAC dataset, with simulated rain and fog augmentation applied to some of the images to improve generalization. The experimental results show that SFW-YOLOv8 achieves mAP50 and mAP50:5:95 values of 79.1% and 63.6%, which are 1.7% and 3.3% higher than those of the YOLOv8 model, respectively, with an inference speed of 11 ms/frame, indicating good detection performance.
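The abstract does not say which variant of WIoU is used, so the following is only a minimal sketch of the base form (often written WIoU v1), assuming boxes in (x1, y1, x2, y2) format. It scales the plain IoU loss by a factor that grows with the distance between box centres, normalized by the size of the smallest enclosing box. The function names `iou` and `wiou_v1` are illustrative and not taken from the paper, and the dynamic focusing term that down-weights low-quality boxes in later WIoU variants is not included.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def wiou_v1(pred, gt):
    """Sketch of a WIoU-v1-style loss: distance factor times (1 - IoU).

    Assumption: in a training framework the enclosing-box size would be
    treated as a constant (detached from the gradient); plain Python
    has no gradients, so that detail does not appear here.
    """
    # Centres of the predicted and ground-truth boxes.
    cx_p, cy_p = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    cx_g, cy_g = (gt[0] + gt[2]) / 2, (gt[1] + gt[3]) / 2
    # Width and height of the smallest box enclosing both boxes.
    w_enc = max(pred[2], gt[2]) - min(pred[0], gt[0])
    h_enc = max(pred[3], gt[3]) - min(pred[1], gt[1])
    denom = w_enc ** 2 + h_enc ** 2
    dist_sq = (cx_p - cx_g) ** 2 + (cy_p - cy_g) ** 2
    # Distance factor is >= 1 and equals 1 when the centres coincide.
    factor = math.exp(dist_sq / denom) if denom > 0 else 1.0
    return factor * (1.0 - iou(pred, gt))


if __name__ == "__main__":
    print(wiou_v1((0, 0, 2, 2), (1, 1, 3, 3)))
```

A perfectly aligned prediction gives a loss of zero, and for any other prediction the loss is at least the plain IoU loss because the distance factor never drops below 1.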
Keywords: vehicle object detection; spatio-temporal feature fusion; Transformer; YOLOv8; attention mechanism
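Classification numbers: TP391.41 [Automation and Computer Technology: Computer Application Technology]; TP183 [Automation and Computer Technology: Computer Science and Technology]; U495 [Transportation Engineering: Transportation Planning and Management]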