Research on the design of large-scale intelligent computing training cluster in data center rooms (大规模智算训练集群机房设计研究)


Authors: ZHAO Jin-ming (赵金铭); ZHU Li (朱丽); JIANG Yu-guang (姜宇光); SUN Li-feng (孙立峰); LIU Lian (刘恋); WU Zhi-ang (吴志昂)

Affiliation: [1] China Mobile Group Design Institute Co., Ltd., Beijing 100080, China

Source: Telecom Engineering Technics and Standardization (《电信工程技术与标准化》), 2024, Issue S02, pp. 45-49 (5 pages)

Abstract: Large-scale intelligent computing training clusters adopt a lossless network architecture, building a large-scale, low-latency cluster from parameter-plane and sample-plane networks to provide cluster computing power for intelligent computing services. The clustered nature of these services differs greatly from the assumptions behind traditional data center room design, so how to design and lay out intelligent computing center rooms, and how to improve the match between the rooms and the services they host, has become the core issue in building them. This paper analyzes the patterns in the quantities of networking equipment required and, taking into account room constraints on power supply, cooling, and floor area, analyzes rack layout at the module, room, and building levels in order to improve the match between facility planning and intelligent computing service requirements.

Keywords: intelligent computing cluster; lossless network; rack demand; rack layout

Classification number: TU248.7 [Architectural Science: Architectural Design and Theory]

 
