Design of an Energy-Efficient, Low-Latency BNN Hardware Accelerator
(Original Chinese title: 高能效低延迟的BNN硬件加速器设计)


Authors: ZHOU Peipei (周培培); DU Gaoming (杜高明); LI Zhenmin (李桢旻); WANG Xiaolei (王晓蕾)

Affiliation: [1] School of Microelectronics, Hefei University of Technology, Hefei 230601, Anhui, China

Source: Journal of Hefei University of Technology: Natural Science (《合肥工业大学学报(自然科学版)》), 2024, No. 12, pp. 1655-1661 (7 pages)

Funding: National Key Research and Development Program of China (2018YFB2202604); Anhui Provincial University Collaborative Innovation Project (GXXT-2019-030).

Abstract: Hardware designs for binary neural networks (BNNs) face two problems: the large number of zero values increases the amount of computation, and repeated operations on the same weight data and the same feature-map data increase computing cycles and power consumption. To address them, this paper proposes an all-zero skipping method and a precomputed result caching method, which reduce the network's computation, computing cycles, and computing power. Based on these methods, a BNN hardware accelerator is designed on a field programmable gate array (FPGA) and applied as a handwritten digit recognition system. Experimental results show that, with both methods applied, the accelerator reaches an average energy efficiency of 1.81 TOPS/W at 100 MHz, which is 1.27 to 4.34 times higher than that of other BNN accelerators.
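The abstract names the two methods but does not describe how they are implemented. The following Python sketch is one plausible software analogue, not the paper's hardware design. It assumes activations are 0/1 bits and weights are +1/-1 bits packed into integers (weight bit 1 encodes +1, bit 0 encodes -1). The function names `binary_dot` and `run_layer` and the dictionary-based cache are hypothetical choices made for illustration.

```python
# Illustrative sketch only; the encoding and the cache structure are assumptions.
from typing import Dict, List, Tuple


def binary_dot(window: int, weights: int) -> int:
    """Dot product of a bit-packed 0/1 window with bit-packed +1/-1 weights.

    Only positions where the window bit is 1 contribute to the sum.
    """
    ones = bin(window).count("1")            # active input positions
    plus = bin(window & weights).count("1")  # active positions with weight +1
    return 2 * plus - ones                   # (+1 hits) minus (-1 hits)


def run_layer(windows: List[int], weights: int) -> Tuple[List[int], Dict[str, int]]:
    """Apply one weight vector to many windows, skipping and caching work."""
    cache: Dict[Tuple[int, int], int] = {}
    stats = {"skipped": 0, "cache_hits": 0, "computed": 0}
    outputs: List[int] = []
    for window in windows:
        if window == 0:
            # All-zero skipping: an all-zero window always yields 0.
            stats["skipped"] += 1
            outputs.append(0)
            continue
        key = (window, weights)
        if key in cache:
            # Precomputed result caching: reuse an earlier identical operation.
            stats["cache_hits"] += 1
        else:
            cache[key] = binary_dot(window, weights)
            stats["computed"] += 1
        outputs.append(cache[key])
    return outputs, stats


if __name__ == "__main__":
    example_windows = [0b000000000, 0b000011011, 0b000000000, 0b000011011, 0b111000000]
    print(run_layer(example_windows, 0b000011111))
```

In this sketch, only two of the five example windows trigger a full dot product; the rest are either skipped or served from the cache. A hardware version would presumably replace the dictionary with a small lookup buffer, but the abstract does not specify this.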

Keywords: binary neural network (BNN); weight sharing; repeated operations; field programmable gate array (FPGA); hardware accelerator

Classification number: TN47 [Electronics and Telecommunications: Microelectronics and Solid-State Electronics]

 
