Authors: WU Yan-Xia [1]; LIANG Kai [1]; LIU Ying [2]; CUI Hui-Min [2]
Affiliations: [1] College of Computer Science and Technology, Harbin Engineering University, Harbin 150001; [2] State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190
Source: Chinese Journal of Computers, 2019, No. 11, pp. 2461-2480 (20 pages)
Funding: National High Technology Research and Development Program of China (863 Program) project (2016YFB1000402); Natural Science Foundation of Heilongjiang Province (F2018008); Harbin Outstanding Young Talents Fund (2017RAYXJ016)
Abstract: With the advent of the big-data era, deep learning plays a critical role in extracting valuable information from massive data and has been widely applied in fields such as computer vision, speech recognition, and natural language processing. Starting from the characteristics and development trends of deep learning algorithms, this paper analyzes the advantages of using FPGAs to accelerate deep learning as well as the technical challenges involved. It then introduces CPU-FPGA platforms from two perspectives, SoC FPGA and standard FPGA, focusing on how the two models differ in data communication between the CPU and the FPGA. Next, after presenting the development environments for FPGA-accelerated deep learning, it describes in detail the design of FPGA accelerators for convolutional neural networks from three aspects: hardware structure, design approach, and optimization strategy. Finally, it discusses future directions for research on FPGA acceleration of deep learning algorithms.

Extended abstract (English, as published): With the coming of the big-data era, deep learning plays a critical role in extracting meaningful information from massive data, and it has been widely applied in domains such as computer vision, speech recognition, and natural language processing. Deep learning algorithms have large numbers of parameters and rely on matrix multiplication or multiply-and-add operations within a simple computing model. At the same time, research and industry demand ever higher accuracy, and increasingly complex models with more and more weights have won image-classification and object-detection contests, so speeding up the inference and training of deep learning has become much more important. This paper mainly reviews one of the approaches: accelerating deep learning on FPGAs. First, the paper introduces deep learning algorithms and their characteristics, especially the convolutional neural network and the recurrent neural network, and explains why the FPGA approach fits this problem, what benefits it offers compared with CPU-only or GPU solutions, and what is required to enable it. It then analyzes the challenges of accelerating deep learning on FPGAs. Additionally, CPU-FPGA is currently the most popular architecture, but there are different ways to set up a usable platform, and for CPU-FPGA platforms data communication is one of the key factors affecting acceleration performance. On this basis, the paper introduces the different methods from the aspects of SoC FPGA and standard FPGA, and comparatively analyzes how the two methods differ in data communication between the CPU and the FPGA. For different engineers and developers, the development environments and tools for accelerating deep learning on FPGAs are presented from the aspects of high-level languages, such as C and OpenCL, and hardware description languages, such as Verilog. It is not hard to use techniques which require many complex low-level hardware control operations for FPGA implementations to improve
Keywords: deep learning; neural network; CPU-FPGA; hardware acceleration; FPGA
Classification: TP302 [Automation and Computer Technology: Computer System Architecture]