A dynamic remainder processing mapping model for convolutional neural network accelerator on FPGA

Authors: ZHAO Xiao-qiang; JIANG Jing-fei [1]; XU Jin-wei; DOU Yong [1] (College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China)

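Affiliation: [1] College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, Hunan, China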

Source: Computer Engineering & Science, 2021, No. 9, pp. 1521-1528 (8 pages)

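Funding: National Science and Technology Major Project on Core Electronic Devices, High-end Generic Chips and Basic Software (2018ZX01028101); Pre-research Project (31513010602-1).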

Abstract: Converting convolution into matrix multiplication is an efficient way to implement it on FPGAs. However, existing conversion methods cannot adjust dynamically to different convolution parameters, which limits the parallelism of the convolution computation. This paper proposes a new dynamic remainder processing mapping model consisting of three sub-models: a feature mapping model, a weight mapping model, and an output mapping model. The feature mapping model converts the features into a feature matrix, and the weight mapping model converts the weights into a weight matrix. The two matrices pass through a multiply-accumulate (MAC) array to produce the convolution results, which the output mapping model stores to memory. During convolution, the number of output channels is usually not an integer multiple of the number of rows in the MAC array; the three sub-models dynamically adjust their mapping according to the resulting remainder to raise the utilization of the MAC array. Experiments show that the dynamic remainder processing mapping model increases the parallelism of the remainder portion by a factor of up to the convolution kernel size, giving the whole accelerator higher actual throughput and energy efficiency.

Keywords: convolution; matrix multiplication; FPGA; dynamic remainder processing

Classification: TP183 [Automation and Computer Technology: Control Theory and Control Engineering]
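
The abstract describes converting features and weights into matrices, multiplying them on a MAC array, and handling the remainder that arises when the output-channel count is not a multiple of the array's row count. The following is a minimal software sketch of those ideas, not the paper's hardware design. All function names are hypothetical, the convolution assumes stride 1 and no padding, and `remainder_parallelism` encodes only one possible reading of the abstract's claim (replicating the remainder channels across idle rows, capped at the kernel size); the paper's actual mapping may differ.

```python
from itertools import product


def im2col(feature, kh, kw):
    """Unroll a C x H x W feature map (nested lists) into a feature matrix.

    Rows index (channel, dy, dx); columns index output pixels.
    Assumes stride 1 and no padding.
    """
    c, h, w = len(feature), len(feature[0]), len(feature[0][0])
    oh, ow = h - kh + 1, w - kw + 1
    cols = []
    for y, x in product(range(oh), range(ow)):
        cols.append([feature[ch][y + dy][x + dx]
                     for ch in range(c)
                     for dy in range(kh)
                     for dx in range(kw)])
    # Transpose: one row per (channel, dy, dx), one column per output pixel.
    return [list(row) for row in zip(*cols)]


def weights_to_matrix(weights):
    """Flatten M x C x kh x kw weights into an M x (C*kh*kw) weight matrix."""
    return [[v for ch in kernel for row in ch for v in row]
            for kernel in weights]


def matmul(a, b):
    """Plain matrix product; stands in for the MAC array."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def static_utilization(out_channels, array_rows):
    """Fraction of MAC rows doing useful work when each row handles one
    output channel and the final (remainder) pass leaves rows idle."""
    passes = -(-out_channels // array_rows)  # ceiling division
    return out_channels / (passes * array_rows)


def remainder_parallelism(out_channels, array_rows, kernel_size):
    """ASSUMPTION, not from the paper: number of copies of the remainder
    channels that fit into the array rows, capped at the kernel size."""
    rem = out_channels % array_rows
    if rem == 0:
        return 1  # no remainder pass, nothing to replicate
    return min(array_rows // rem, kernel_size)


if __name__ == "__main__":
    feature = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]   # C=1, H=W=3
    weights = [[[[1, 0], [0, 1]]], [[[1, 1], [1, 1]]]]  # M=2, 2x2 kernels
    out = matmul(weights_to_matrix(weights), im2col(feature, 2, 2))
    print(out)  # one row per output channel, one column per output pixel
    print(static_utilization(48, 32))
    print(remainder_parallelism(36, 32, 9))
```

With 48 output channels on a hypothetical 32-row array, the second pass uses only 16 rows, which is the kind of idle capacity the abstract says the dynamic mapping reclaims.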
