Large Language Model Specific Hardware Architecture Based on Integrated Compute-in-Memory Chips

Cited by: 2

Authors: HE Siqi; MU Chen; CHEN Chixiao [1] (Fudan University, Shanghai 200433, China)

Affiliation: [1] Fudan University, Shanghai 200433, China

Source: ZTE Technology Journal, 2024, Issue 2, pp. 37-42 (6 pages)

Funding: National Natural Science Foundation of China (62322404); "Compute-in-Memory Architecture Research Project" of the Fudan University-ZTE Joint Laboratory for Strong Computing Architecture Research.

Abstract: Artificial intelligence (AI) models represented by ChatGPT are growing exponentially in parameter count and system computing power requirements. This paper studies dedicated hardware architectures for large models, analyzes in detail the bandwidth bottlenecks these models face during deployment, and examines the significant impact of those bottlenecks on current data centers. To address this issue, a solution based on integrated compute-in-memory chiplets is proposed, aiming to relieve data-transfer pressure and improve the energy efficiency of large-model inference. In addition, the co-design of model lightweighting and in-memory compression under the compute-in-memory architecture is studied, in order to achieve dense mapping of sparse networks onto compute-in-memory hardware, thereby significantly improving storage density and computational energy efficiency.
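The "dense mapping of sparse networks" idea mentioned in the abstract can be illustrated with a minimal sketch: under structured (whole-column) pruning, the zero columns of a weight matrix need not occupy crossbar cells at all; only the surviving columns are stored densely, together with an index map used to gather the matching activations at inference time. All names below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Hypothetical sketch of dense mapping for a structurally sparse layer.
# Assumption: sparsity is structured (entire input columns are pruned),
# which is what lets the pruned matrix pack densely onto a CIM crossbar.

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 16))
W[:, rng.choice(16, size=8, replace=False)] = 0.0  # prune 8 of 16 input columns

keep = np.flatnonzero(np.any(W != 0.0, axis=0))  # indices of surviving columns
W_dense = W[:, keep]                             # densely packed crossbar image

x = rng.standard_normal(16)
y_ref = W @ x             # original sparse matrix-vector product
y_cim = W_dense @ x[keep] # dense mapping: gather activations, then dense matmul

assert np.allclose(y_ref, y_cim)  # same output with half the stored weights
```

The dense image `W_dense` holds half as many cells as `W` while producing identical outputs, which is the storage-density and energy-efficiency gain the abstract refers to; the index map `keep` is the extra metadata the scheme must keep alongside the array.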

Keywords: large language model; compute-in-memory; integrated chiplets; in-memory compression

CLC number: TP309.2 [Automation and Computer Technology: Computer System Architecture]; TP391.44 [Automation and Computer Technology: Computer Science and Technology]
