Authors: SHI Yujie; YANG Kexiang; LIU Xudong; HE Hu (School of Integrated Circuits, Tsinghua University, Beijing 100084, China)
Source: Microelectronics & Computer, 2024, No. 5, pp. 109-116 (8 pages)
Abstract: To meet the growing demands for computational power and versatility in neural networks, a tensor processor that accelerates convolution and general matrix multiplication (GEMM) is designed based on the open-source "Ventus" GPGPU project. This study first analyzes existing tensor processor design schemes and their corresponding algorithms, comparing their performance against direct convolution computation. A tensor processor design based on a three-dimensional multiplication-tree structure is then proposed and deployed on a Xilinx VCU128 development board, where it operates at a frequency of 222 MHz. Additionally, an exponential operation unit is developed to support neural network operations; it operates at 159 MHz on the same board. Finally, the functional correctness of the tensor processor is verified with hand-written assembly programs, and the results demonstrate a significant reduction in expected execution time after the tensor processor is introduced. These findings contribute to the advancement of hardware acceleration for deep learning applications and provide a foundation for further research in this field.
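The abstract's central point is that one hardware unit can serve both convolution and GEMM, because a convolution can be lowered to a single matrix multiplication (the classic im2col transformation). The sketch below is not the paper's hardware design; it is a minimal NumPy illustration, with hypothetical function names, of why a GEMM-accelerating tensor unit also accelerates convolution:

```python
import numpy as np

def im2col(x, kh, kw):
    """Unfold a (C, H, W) input into a (C*kh*kw, out_h*out_w) matrix of
    patches, so that convolution becomes one matrix multiplication."""
    c, h, w = x.shape
    out_h, out_w = h - kh + 1, w - kw + 1
    cols = np.empty((c * kh * kw, out_h * out_w))
    row = 0
    for ci in range(c):
        for i in range(kh):
            for j in range(kw):
                # Each (ci, i, j) offset contributes one row of the patch matrix.
                cols[row] = x[ci, i:i + out_h, j:j + out_w].reshape(-1)
                row += 1
    return cols

def conv2d_as_gemm(x, weights):
    """x: (C, H, W); weights: (N, C, kh, kw). Returns (N, out_h, out_w).
    The entire convolution is carried by the single GEMM below, which is
    the step a tensor processor would accelerate."""
    n, c, kh, kw = weights.shape
    out_h, out_w = x.shape[1] - kh + 1, x.shape[2] - kw + 1
    w_mat = weights.reshape(n, c * kh * kw)   # each filter flattened to a row
    y = w_mat @ im2col(x, kh, kw)             # one GEMM does all the work
    return y.reshape(n, out_h, out_w)
```

The result matches a direct sliding-window convolution; the difference is that all multiply-accumulates are gathered into one dense GEMM, matching the access pattern a multiplication-tree datapath is built for.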
Keywords: general-purpose graphics processing unit (GPGPU); tensor processor; convolution; general matrix multiplication (GEMM); exponential operation
Classification: TN47 [Electronics and Telecommunications — Microelectronics and Solid-State Electronics]