Authors: QUAN Lianghua (权良华), WANG Yilin (王艺霖), LI Siyue (黎思越), LI Shiping (李世平), CHEN Kai (陈铠), DENG Songfeng (邓松峰), HE Guoqiang (何国强), FENG Shuyi (冯书谊), FU Yuxiang (傅玉祥) [2], LI Li (李丽) [1]
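Affiliations: [1] School of Electronic Science and Engineering, Nanjing University, Nanjing 210023, China; [2] School of Integrated Circuits, Nanjing University, Suzhou, Jiangsu 215163, China; [3] Jiangsu Huachuang Microsystem Co., Ltd., Nanjing 211800, China; [4] Shanghai Institute of Spacecraft Electronics Technology, Shanghai 201108, China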
Source: Electronics & Packaging (《电子与封装》), 2024, No. 9, pp. 59-65 (7 pages)
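Funding: National Natural Science Foundation of China, Young Scientists Fund (62104098); National Natural Science Foundation of China, Joint Fund for Enterprise Innovation and Development, Key Program (U21B2032); Jiangsu Province Basic Research Program (Natural Science Foundation), Youth Fund (BK20210178); Ministry of Science and Technology Key Research and Development Program (2021YFB3600104, 2023YFB2806802).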
Abstract: A reconfigurable intelligent acceleration core architecture is proposed, and a reconfigurable activation function multiply-accumulate unit (ACT-MAC) is designed, with the aim of improving the utilization of computing resources under low-power constraints. The acceleration core builds a reconfigurable computing array from ACT-MAC units and supports hardware acceleration of convolution, pooling, long short-term memory (LSTM) networks, and activation functions. The acceleration core uses a ping-pong pipeline design and optimized memory allocation, significantly improving data processing efficiency. It is integrated with an open-source RISC-V processor through the Nuclei Instruction Co-unit Extension (NICE) interface, forming a complete system on chip (SoC). The design is implemented as a chip prototype on the Nexys Video field-programmable gate array (FPGA), on which LeNet, VGG16, and LSTM networks are deployed, demonstrating the application potential of the SoC prototype in areas such as image classification and semantic recognition. Compared with recent work, the design improves digital signal processing (DSP) efficiency while maintaining a high energy efficiency ratio, and supports hardware acceleration for a variety of artificial intelligence algorithms, indicating broad potential in embedded application scenarios.
Keywords: RISC-V; reconfigurable computing; nonlinear computing; artificial intelligence; SoC
Classification: TN47 [Electronics and Telecommunications: Microelectronics and Solid-State Electronics]; TN402