Design and FPGA implementation of lightweight convolutional neural network hardware acceleration


Authors: LI Zhenqi; WANG Qiang[1]; QI Xingyun[1]; LAI Mingche[1]; ZHAO Yankang; LU Yihang; LI Yuan[1] (College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China)

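Affiliation: [1] College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, Hunan, China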

Source: Computer Engineering & Science (《计算机工程与科学》), 2025, No. 4, pp. 582-591 (10 pages)

Abstract: In recent years, convolutional neural networks (CNNs) have achieved remarkable results in fields such as computer vision. However, CNNs typically have complex network structures and heavy computational requirements, which makes them difficult to deploy on portable devices with limited computational resources and power budgets. FPGAs, with their high parallelism, energy efficiency, and reconfigurability, have emerged as one of the most effective computing platforms for accelerating CNN inference on portable devices. This paper presents a CNN accelerator that can be configured for different network structures, and optimizes its latency and power consumption in three ways: data reuse, pipeline optimization based on row buffers, and low-latency convolution based on adder trees. Taking the lightweight network YOLOv2-tiny as an example, a real-time object detection system was built on the Navigator ZYNQ-7020 development board. Experimental results show that the complete design uses 88% of the available resources and consumes 2.959 W, meeting the low hardware cost and low power requirements of portable devices. At an image resolution of 416×256, it achieves a detection speed of 3.91 fps.
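The abstract names row buffers and adder trees but gives no implementation detail, so the following is only a software sketch of those two general ideas, not the authors' hardware design. It assumes a 3×3 kernel and a single channel; the function names `adder_tree_sum` and `conv3x3_row_buffered` are hypothetical. Rows are consumed in streaming order while only the three most recent rows are held, and the nine products of each window are summed pairwise, level by level, the way a hardware adder tree would reduce them.

```python
from collections import deque


def adder_tree_sum(values):
    """Sum values by pairwise reduction; return (total, number of tree levels)."""
    level = list(values)
    depth = 0
    while len(level) > 1:
        # Add neighbouring pairs; in hardware these additions run in parallel.
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            # An odd leftover element passes through to the next level.
            nxt.append(level[-1])
        level = nxt
        depth += 1
    return (level[0] if level else 0), depth


def conv3x3_row_buffered(image, kernel):
    """Valid 3x3 convolution (CNN-style cross-correlation) over streamed rows."""
    k = 3
    rows = deque(maxlen=k)  # row buffer: keeps only the k most recent rows
    flat_kernel = [w for kernel_row in kernel for w in kernel_row]
    out = []
    for row in image:  # rows arrive one at a time, as from a pixel stream
        rows.append(row)
        if len(rows) < k:
            continue  # not enough rows buffered yet to form a window
        out_row = []
        for c in range(len(row) - k + 1):
            window = [rows[r][c + dc] for r in range(k) for dc in range(k)]
            products = [p * w for p, w in zip(window, flat_kernel)]
            total, _ = adder_tree_sum(products)
            out_row.append(total)
        out.append(out_row)
    return out


if __name__ == "__main__":
    image = [[4 * r + c for c in range(4)] for r in range(4)]
    ones = [[1, 1, 1] for _ in range(3)]
    print(conv3x3_row_buffered(image, ones))
    print(adder_tree_sum(range(9)))
```

In this model, nine products reduce in four tree levels rather than eight sequential additions, which is the latency argument usually made for adder trees. The row buffer means each input row is fetched once and reused by up to three output rows.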

Keywords: convolutional neural network; FPGA acceleration; accelerator; portable devices

Classification: TP183 [Automation and Computer Technology: Control Theory and Control Engineering]

 
