Survey on System Optimization for Mixture of Experts in the Era of Large Models

Authors: Shi Hongzhi [1,2]; Zhao Jian; Zhao Yaqian [1,2]; Li Ruyang; Wei Hui [1,2]; Hu Kekun; Wen Dongchao; Jin Liang

Affiliations: [1] IEIT SYSTEMS Co., Ltd., Jinan 250101; [2] Inspur (Beijing) Electronic Information Industry Co., Ltd., Beijing 100095

Source: Journal of Computer Research and Development (《计算机研究与发展》), 2025, No. 5, pp. 1164-1189 (26 pages)

Funding: Natural Science Foundation of Shandong Province (ZR2020QF035).

Abstract: In recent years, large models have driven unprecedented progress in natural language processing, machine vision, and many other fields. Mixture of experts (MoE) has become one of the mainstream architectures for large models owing to its distinct advantages in scaling model parameters, controlling computational cost, and handling complex tasks. However, as parameter scales continue to grow, system execution efficiency and scalability increasingly struggle to meet demand, and this challenge must be addressed urgently. System optimization is an effective way to tackle it and has become an active research area. This paper therefore surveys the state of research on MoE system optimization techniques in the era of large models. It first introduces the current development of MoE large models and analyzes the performance bottlenecks they face on the system side; it then comprehensively reviews and analyzes the latest research progress along four core system dimensions, namely memory footprint, communication latency, computational efficiency, and parallel scaling, comparing in detail the key techniques involved, their applicable scenarios, and the directions still to be optimized; finally, it summarizes the current state of MoE system optimization research and outlines future research directions.
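To make the property mentioned in the abstract concrete (total parameters grow with the number of experts while per-token compute stays roughly constant), the following is a minimal, illustrative sketch of a top-k gated MoE layer in PyTorch. It is not taken from the surveyed paper; the class name SimpleMoELayer and all hyperparameters are hypothetical choices for demonstration only.

```python
# Minimal sketch of a top-k gated MoE layer (illustrative, not the surveyed system).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleMoELayer(nn.Module):
    def __init__(self, d_model: int, d_hidden: int, num_experts: int, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        # Router: scores every expert for every token.
        self.gate = nn.Linear(d_model, num_experts, bias=False)
        # Experts: independent feed-forward networks; parameter count scales with num_experts.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, d_model)
        logits = self.gate(x)                               # (tokens, experts)
        weights, indices = logits.topk(self.top_k, dim=-1)  # select k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        # Each token is processed only by its k selected experts, so per-token
        # FLOPs stay roughly constant even as num_experts (and parameters) grow.
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

if __name__ == "__main__":
    layer = SimpleMoELayer(d_model=64, d_hidden=256, num_experts=8, top_k=2)
    tokens = torch.randn(10, 64)
    print(layer(tokens).shape)  # torch.Size([10, 64])
```

The sparse, data-dependent expert selection in this sketch is also the source of the system-side bottlenecks the survey addresses: expert parameters dominate memory, token-to-expert dispatch introduces all-to-all communication, and load imbalance across experts hurts computational efficiency and parallel scaling.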

Keywords: large models; mixture of experts; memory offloading; hierarchical communication; expert placement; expert activation prediction; adaptive parallelism

Classification: TP391 [Automation and Computer Technology / Computer Application Technology]

 
