Improved YOLOv4 face mask detection and hardware acceleration

Cited by: 1


Authors: SU Wen-jun; ZHANG Xue-jun [1,2,3]; XU Xian-fu; TAN Yi-xuan; LI Bin; BAN Yan-jiao

Affiliations: [1] School of Computer and Electronic Information, Guangxi University, Nanning 530004, China; [2] School of Medicine, Guangxi University, Nanning 530004, China; [3] Key Laboratory of Multimedia Communication and Network Technology, Guangxi University, Nanning 530004, China

Source: Computer Engineering and Design, 2023, No. 3, pp. 798-806 (9 pages)

Funding: Guangxi Science and Technology Key Fund Project (2020AA21077007); Guangxi Graduate Education Innovation Fund Project (YCBZ2020026).

Abstract: YOLOv4-based face mask detection has a large number of parameters and a heavy computational load, which makes it difficult to deploy on embedded devices with limited hardware resources. To address this, a lightweight YOLOv4 algorithm is proposed and a hardware accelerator for convolutional neural networks is designed. The backbone network is replaced with MobileNetv2, and some of the standard convolutions are replaced with depthwise separable convolutions to compress the network structure. The SPP module is modified to fit the pooling window sizes supported by Vitis AI. In the neck network, CSP structures are added to make the network easier to optimize. Experimental results show that the improved algorithm reduces the model size by 84.42% at a cost of 0.25% in detection accuracy. On ZYNQ, the mAP reaches 95.16%, and the average DPU utilization is reduced by 38%.
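
The compression step in the abstract rests on replacing standard convolutions with depthwise separable ones. The sketch below only illustrates the general weight-count arithmetic behind that replacement; it is not the authors' code. The function names and the layer shape (3 x 3 kernel, 256 input channels, 512 output channels) are hypothetical and are not taken from the paper, and biases are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weight count of a k x k standard convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weight count of a k x k depthwise convolution followed by a
    1 x 1 pointwise convolution (bias ignored)."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


def reduction_ratio(k, c_in, c_out):
    """Separable weight count as a fraction of the standard weight count.
    Algebraically this equals 1 / c_out + 1 / k**2."""
    return (depthwise_separable_params(k, c_in, c_out)
            / standard_conv_params(k, c_in, c_out))


# Hypothetical layer shape chosen for illustration only.
k, c_in, c_out = 3, 256, 512
std = standard_conv_params(k, c_in, c_out)
dws = depthwise_separable_params(k, c_in, c_out)
print(f"standard: {std}, separable: {dws}, "
      f"ratio: {reduction_ratio(k, c_in, c_out):.4f}")
```

For a 3 x 3 kernel the ratio approaches 1/9 as the number of output channels grows, which is why the substitution shrinks a layer substantially. The 84.42% figure reported in the abstract refers to the whole model and also reflects the MobileNetv2 backbone, so it cannot be derived from this single-layer calculation.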

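Keywords: field-programmable gate array (FPGA); hardware acceleration; lightweight network; YOLOv4; spatial pyramid pooling; face mask detection; depthwise separable convolution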

Classification number: TP391.41 [Automation and Computer Technology: Computer Application Technology]

 
