Authors: LI Zheng-qing (李政清) [1], MU Ji-liang (穆继亮)
Affiliations: [1] College of Science and Technology, University of Sanya, Sanya, Hainan 570200, China; [2] School of Instrument and Electronics, North University of China, Taiyuan, Shanxi 030000, China
Source: Computer Simulation (《计算机仿真》), 2022, No. 3, pp. 244-248 (5 pages)
Abstract: In FPGA data processing applications, introducing a neural network can improve the ability to learn data features. However, non-embedded neural networks usually exhibit significant complexity and sparsity during computation, which makes them difficult to apply directly on an FPGA. To improve the parallelism and efficiency of FPGA data processing, an FPGA data processing architecture based on a convolutional network accelerator is designed. First, the layers of the convolutional network are optimized: the ReLU function is used to accelerate convergence of the convolutional layer, average pooling is used to improve the adaptability of the network, and scale transformation of the convolution is used to compress the feature maps so that computation within a layer can run in parallel. Next, the processing module and cache module of the FPGA are optimized: a decider verifies parameters such as the weight index and count of valid data, and the large number of multiply-add operations is handed to the FPGA's DSP resources; feature maps and their intermediate variables are cached in BRAM, allocated separately along the horizontal, vertical, and depth dimensions. Finally, the FPGA resource utilization and execution time during accelerator execution are analyzed, and the execution process is adjusted according to these resource and time factors. Experimental results show that the FPGA data processing scheme based on a convolutional network accelerator improves the FPGA's resource utilization and effective computing power, and achieves better data processing performance in comparisons across different platforms and against different accelerators.
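The layer-level steps named in the abstract (convolution, ReLU activation, average pooling, and skipping work for zero weights) can be sketched in software as follows. This is only an illustrative pure-Python model, not the authors' hardware design: the function names `conv2d_valid`, `relu`, and `avg_pool`, the 2x2 pooling window, and the idea of pre-filtering non-zero weights as a stand-in for the "decider" are all assumptions made for this example.

```python
def conv2d_valid(feature, kernel):
    """2D convolution without padding that skips zero weights.

    Pre-selecting the non-zero weights is a hypothetical software stand-in
    for a decider that forwards only valid data to the multiply-add units.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(feature) - kh + 1
    out_w = len(feature[0]) - kw + 1
    # Keep (row, col, weight) only for weights that contribute to the sum.
    nonzero = [(i, j, kernel[i][j])
               for i in range(kh) for j in range(kw)
               if kernel[i][j] != 0]
    out = [[0.0] * out_w for _ in range(out_h)]
    for r in range(out_h):
        for c in range(out_w):
            acc = 0.0
            for i, j, w in nonzero:
                # One multiply-accumulate per non-zero weight.
                acc += w * feature[r + i][c + j]
            out[r][c] = acc
    return out


def relu(fmap):
    """Element-wise ReLU: negative values become 0."""
    return [[max(0.0, v) for v in row] for row in fmap]


def avg_pool(fmap, size=2):
    """Non-overlapping average pooling with a size x size window."""
    out = []
    for r in range(0, len(fmap) - size + 1, size):
        row = []
        for c in range(0, len(fmap[0]) - size + 1, size):
            window = [fmap[r + i][c + j]
                      for i in range(size) for j in range(size)]
            row.append(sum(window) / (size * size))
        out.append(row)
    return out


# Toy 5x5 input where feature[r][c] = r * c, and a sparse 2x2 kernel.
feature = [[float(r * c) for c in range(5)] for r in range(5)]
kernel = [[-1.0, 0.0],
          [0.0, 1.0]]
result = avg_pool(relu(conv2d_valid(feature, kernel)))
print(result)  # [[2.0, 4.0], [4.0, 6.0]]
```

With this kernel, half of the weights are zero, so each output pixel needs two multiply-accumulate operations instead of four. On hardware the analogous saving would come from not issuing those operations at all, which is the kind of effect the abstract attributes to verifying valid-data indices before dispatch.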
Classification: TP391 [Automation and Computer Technology: Computer Application Technology]