Design of a MobileNetV1 Object Detection Accelerator Based on FPGA


Authors: YAN Fei (严飞) [1,2]; ZHENG Xuwen (郑绪文); MENG Chuan (孟川); LI Chu (李楚); LIU Yinping (刘银萍)

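Affiliations: [1] School of Automation, Nanjing University of Information Science and Technology, Nanjing 210044, Jiangsu, China; [2] Jiangsu Collaborative Innovation Center for Atmospheric Environment and Equipment Technology, Nanjing 210044, Jiangsu, China; [3] School of Emergency Management, Nanjing University of Information Science and Technology, Nanjing 210044, Jiangsu, China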

Source: Modern Electronics Technique (《现代电子技术》), 2025, No. 1, pp. 151-156 (6 pages)

Funding: National Natural Science Foundation of China (61605083); Key Research and Development Program of Jiangsu Province (BE2020006-2).

Abstract: Convolutional neural networks (CNNs) are widely used for object detection, but their large parameter counts and computational loads lead to slow detection and high power consumption, and make them difficult to deploy on hardware platforms. This paper therefore proposes a method that accelerates MobileNetV1 object detection on a combined CPU and FPGA architecture. First, the parameter count and computational load of the network model are reduced by setting the width and resolution hyperparameters and by quantizing the network parameters to fixed point. Second, the convolutional and batch normalization layers are fused to reduce network complexity and increase computation speed. Third, an eight-channel inter-kernel parallel convolution engine is designed, in which each channel performs convolution using a line buffer, multipliers, and an adder tree. Finally, exploiting the FPGA's parallel computation and pipelining, the eight-channel engine is reused to carry out three different types of convolution, which reduces hardware resource usage and power consumption. Experimental results show that the design accelerates MobileNetV1 object detection in hardware, reaching a frame rate of 56.7 f/s at a power consumption of only 0.603 W.
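
The abstract mentions fusing the convolutional and batch normalization layers but does not give the arithmetic. The sketch below shows the standard inference-time folding, assuming per-output-channel batch-norm parameters; it is an illustration only and not the authors' implementation. All function names (`fold_batchnorm`, `conv_at_one_position`, `batchnorm`) and the sample values are hypothetical.

```python
import math


def fold_batchnorm(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold per-channel batch-norm parameters into conv weights and biases.

    weights: one flat list of kernel coefficients per output channel.
    bias, gamma, beta, mean, var: one float per output channel.
    (Hypothetical helper; the data layout is an assumption.)
    """
    folded_w, folded_b = [], []
    for w, b, g, bt, m, v in zip(weights, bias, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)           # per-channel BN scale factor
        folded_w.append([scale * k for k in w])  # w' = scale * w
        folded_b.append(scale * (b - m) + bt)    # b' = scale * (b - mean) + beta
    return folded_w, folded_b


def conv_at_one_position(weights, bias, window):
    """Dot product of each channel's kernel with one flattened input window."""
    return [sum(x * k for x, k in zip(window, w)) + b
            for w, b in zip(weights, bias)]


def batchnorm(values, gamma, beta, mean, var, eps=1e-5):
    """Inference-mode batch normalization applied per channel."""
    return [g * (y - m) / math.sqrt(v + eps) + bt
            for y, g, bt, m, v in zip(values, gamma, beta, mean, var)]


# Made-up example: two output channels, three coefficients per kernel.
weights = [[1.0, -2.0, 0.5], [0.25, 0.0, 3.0]]
bias = [0.1, -0.2]
gamma, beta = [1.5, 0.8], [0.0, 0.3]
mean, var = [0.2, -0.1], [4.0, 0.25]
window = [2.0, 1.0, -1.0]

# Unfused path: convolution followed by a separate batch-norm step.
reference = batchnorm(conv_at_one_position(weights, bias, window),
                      gamma, beta, mean, var)

# Fused path: a single convolution with folded weights and biases.
fused_w, fused_b = fold_batchnorm(weights, bias, gamma, beta, mean, var)
fused = conv_at_one_position(fused_w, fused_b, window)
```

Because the folded weights and biases are computed once offline, the two paths agree up to floating-point rounding and the hardware would only need to run the convolution. Whether the authors fold before or after fixed-point quantization is not stated in the abstract.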

Keywords: convolutional neural network; object detection; FPGA; MobileNetV1; parallel computing; hardware acceleration

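Classification: TN492-34 [Electronics and Telecommunications: Microelectronics and Solid-State Electronics]; TP391 [Automation and Computer Technology: Computer Application Technology]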

 
