A Survey on Large Model Training Technologies  Cited by: 5


Authors: TIAN Haidong (田海东)[1], ZHANG Mingzheng (张明政), CHANG Rui (常锐)[1], TONG Xianhui (童贤慧) (ZTE Corporation, Shenzhen 518057, China)

Affiliation: [1] ZTE Corporation, Shenzhen 518057, China

Source: ZTE Technology Journal (《中兴通讯技术》), 2024, No. 2, pp. 21-28 (8 pages)

Abstract: Efficient training has become one of the key factors affecting the widespread adoption of large models. Following the general training workflow of data preparation, data loading, model initialization and evaluation, training parallelism, and model state saving, this paper analyzes and discusses the main techniques for efficient large model training. As model scale keeps growing and the range of data types being processed expands, existing large model training techniques still leave considerable room for optimization. The authors argue that the key future research directions for large model training include data-centric approaches, intelligent data loading with heterogeneous acceleration, domain-specific customization of network communication, and training parallelism and its automation.

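Keywords: large model; data preparation; data loading; model initialization; model evaluation; training parallelism; training network; checkpoint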

Classification code: TP391.1 [Automation and Computer Technology: Computer Application Technology]

 
