Design and realization of FPGA-based reverse acceleration of CNN  Cited by: 1

Original title: 基于FPGA的卷积神经网络反向加速设计与实现


Authors: SUN Chuanmeng (孙传猛); DENG Huifang (邓慧芳); WANG Yanping (王燕平); XU Ruijia (许瑞嘉)

Affiliations: [1] State Key Laboratory of Dynamic Measurement Technology, Taiyuan 030051, Shanxi, China; [2] School of Electrical and Control Engineering, North University of China, Taiyuan 030051, Shanxi, China; [3] School of Physics and Astronomy, Sun Yat-sen University, Zhuhai 510275, Guangdong, China

Source: Transducer and Microsystem Technologies (《传感器与微系统》), 2023, No. 8, pp. 107-110 (4 pages)

Funding: Shanxi Province Basic Research Program, General Program (202203021221106, 202103021224199); Shanxi Province Graduate Innovation Project (2022Y634).

Abstract: Convolutional neural networks (CNNs) are computationally expensive and slow to train. The parallelism of field-programmable gate arrays (FPGAs) can be used to accelerate parameter training in the CNN's reverse network and thereby shorten training time. First, a CNN reverse acceleration network with good acceleration performance is designed. Then, an FPGA hardware acceleration system for the CNN and an accelerated computing module are designed, and the Kronecker product is used to reduce the CNN's parameter count and time complexity. Finally, a comparative experiment is carried out between training on the Zedboard using only the processing system (PS) and training the CNN with the reverse-network accelerator in the programmable logic (PL). The experimental results show that the FPGA-based reverse acceleration of the CNN is 111.15 times faster than training on the CPU alone.
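The abstract does not say how the Kronecker product enters the design, so the following is only a generic illustration of why a Kronecker-factored weight matrix can cut both parameters and arithmetic. It assumes a weight matrix of the form W = A ⊗ B and uses the standard identity (A ⊗ B)·vec(X) = vec(A·X·Bᵀ) for row-major flattening. All function names (`kron`, `kron_matvec`, `param_counts`) are hypothetical and are not taken from the paper.

```python
def kron(A, B):
    """Dense Kronecker product of two matrices given as lists of rows."""
    return [
        [a * b for a in row_a for b in row_b]
        for row_a in A
        for row_b in B
    ]


def matmul(P, Q):
    """Plain matrix product of two lists-of-rows matrices."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]


def transpose(M):
    return [list(r) for r in zip(*M)]


def kron_matvec(A, B, x):
    """Compute (A kron B) @ x without ever forming the large matrix.

    A is m x n, B is p x q, x has length n*q. x is reshaped row-major
    into an n x q matrix X, and the result is flatten(A @ X @ B^T).
    """
    n, q = len(A[0]), len(B[0])
    X = [x[j * q:(j + 1) * q] for j in range(n)]
    Y = matmul(matmul(A, X), transpose(B))
    return [v for row in Y for v in row]


def param_counts(A, B):
    """Return (stored parameters when factored, parameters of dense A kron B)."""
    factored = len(A) * len(A[0]) + len(B) * len(B[0])
    dense = (len(A) * len(B)) * (len(A[0]) * len(B[0]))
    return factored, dense


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    B = [[0, 1], [1, 0]]
    x = [1, 2, 3, 4]
    dense = [sum(w * v for w, v in zip(row, x)) for row in kron(A, B)]
    print(dense == kron_matvec(A, B, x))  # both routes agree
    print(param_counts(A, B))
```

For two 2 x 2 factors the saving is small (8 stored values instead of 16), but the gap grows quickly with size: the factored form stores mn + pq values while the dense matrix stores mn·pq.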

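Keywords: reverse acceleration; convolutional neural network (CNN); field-programmable gate array (FPGA); hardware acceleration; Kronecker product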

Classification number: TP331 [Automation and Computer Technology: Computer System Architecture]

 
