Lightweight Network Hardware Acceleration Design for Edge Computing

Original title: 面向边缘计算的轻量级网络硬件加速设计 (cited by: 1)

Authors: YU Yunjun (余运俊) [1], ZHANG Pengfei (张鹏飞), GONG Hancheng (龚汉城), CHEN Min (陈敏)

Affiliations: [1] School of Information Engineering, Nanchang University, Nanchang 330000, China; [2] Jiangxi Jiangtou Digital Economy Research Institute, Nanchang 330000, China

Source: Computer Science (《计算机科学》), 2023, Issue S02, pp. 820-826 (7 pages)

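Funding: National International Science and Technology Cooperation Special Project (2014DFG72240); Jiangxi Provincial Key Research and Development Program (20214BBG74006).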

Abstract: As edge devices generate more data and neural networks are increasingly deployed in practice, edge computing relieves pressure on big data technologies built around cloud computing. Field programmable gate arrays (FPGAs), with their flexible architecture and low power consumption, are well suited to edge computing and to building neural network accelerators. However, FPGA solutions based on the conventional convolution algorithm are often limited by the number of on-chip computing units. This paper uses Zynq as the hardware acceleration platform, quantizes parameters to fixed point, and applies array partitioning to speed up pipelined execution. The Winograd fast convolution algorithm replaces the conventional convolution, converting multiplications in the convolution into additions, which lowers the computational complexity of the model and greatly improves the computational performance of the accelerator. Experiments show that the XC7Z035 achieves 43.5 GOP/s at a 150 MHz clock, with energy efficiency 129 times that of a Xeon(R) Silver 4214R and 159 times that of a dual-core ARM. The proposed solution delivers high performance under constrained resources and power, and is suitable for deploying lightweight neural networks at the network edge.
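As a rough illustration of the Winograd idea named in the abstract, the sketch below computes two outputs of a 3-tap 1D convolution with 4 multiplications instead of 6, using the standard F(2,3) minimal filtering form. The abstract does not state the tile size, data layout, or fixed-point format used in the paper, so the F(2,3) choice, the floating-point arithmetic, and all function names here are assumptions for illustration, not the authors' implementation.

```python
def direct_conv_f23(d, g):
    """Direct form: 2 outputs from 4 inputs and a 3-tap filter (6 multiplications)."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]


def winograd_filter_transform(g):
    """Transform the 3-tap filter into 4 values.

    This depends only on the weights, so it can be computed once offline.
    """
    g0, g1, g2 = g
    return [g0, (g0 + g1 + g2) / 2, (g0 - g1 + g2) / 2, g2]


def winograd_f23(d, u):
    """Winograd F(2,3): 2 outputs using 4 multiplications plus extra additions.

    d is a tile of 4 input samples; u is the pre-transformed filter.
    """
    d0, d1, d2, d3 = d
    # Input transform (additions only), then 4 element-wise multiplications.
    m1 = (d0 - d2) * u[0]
    m2 = (d1 + d2) * u[1]
    m3 = (d2 - d1) * u[2]
    m4 = (d1 - d3) * u[3]
    # Output transform (additions only).
    return [m1 + m2 + m3, m2 - m3 - m4]


if __name__ == "__main__":
    tile = [1.0, 2.0, 3.0, 4.0]        # hypothetical input samples
    weights = [0.5, 0.25, -1.0]        # hypothetical filter taps
    u = winograd_filter_transform(weights)
    print(direct_conv_f23(tile, weights))
    print(winograd_f23(tile, u))
```

Both functions should print the same pair of values. The saving comes from moving work into additions and a one-time filter transform, which matters on an FPGA where multipliers (DSP slices) are typically the scarcer resource.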

Keywords: edge computing; hardware acceleration; lightweight convolutional neural network; Winograd; FPGA

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
