Authors: Shi Hongzhi [1,2]; Zhao Jian; Zhao Yaqian [1,2]; Li Ruyang; Wei Hui [1,2]; Hu Kekun; Wen Dongchao; Jin Liang
Affiliations: [1] IEIT SYSTEMS Co., Ltd. (Inspur Electronic Information Industry Co., Ltd.), Jinan 250101; [2] Inspur (Beijing) Electronic Information Industry Co., Ltd., Beijing 100095
Source: Journal of Computer Research and Development (《计算机研究与发展》), 2025, No. 5, pp. 1164-1189 (26 pages)
Funding: Shandong Provincial Natural Science Foundation (ZR2020QF035)
Abstract: In recent years, large models have driven unprecedented progress in many domains, such as natural language processing and machine vision. Mixture of experts (MoE) has become one of the mainstream architectures for large models owing to its distinct advantages in scaling model parameters, controlling computational cost, and handling complex tasks. However, as parameter scales continue to grow, the execution efficiency and scalability of the underlying systems increasingly fail to meet demand and urgently need to be addressed. System-level optimization is an effective way to tackle this challenge and has become an active research area. This paper therefore surveys the state of research on MoE system optimization techniques in the era of large models. It first reviews the development of MoE large models and analyzes the performance bottlenecks they face on the system side; it then comprehensively organizes and analyzes the latest research progress along four core system dimensions, namely memory footprint, communication latency, computational efficiency, and parallel scaling, and compares in detail the key techniques involved, their applicable scenarios, and the directions still awaiting optimization; finally, it summarizes the current state of MoE system optimization research and outlines future research directions.
Keywords: large models; mixture of experts; memory offloading; hierarchical communication; expert placement; expert activation prediction; adaptive parallelism
Classification: TP391 [Automation and Computer Technology — Computer Application Technology]
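The abstract refers to the mixture-of-experts (MoE) architecture, in which a gating network routes each token to a small subset (top-k) of expert feed-forward networks, so the total parameter count can grow with the number of experts while per-token compute stays roughly constant. Below is a minimal illustrative sketch of a top-k gated MoE layer in PyTorch; the class name MoELayer and all hyperparameters are hypothetical and are not taken from the surveyed paper.

# Minimal, illustrative sketch of a top-k gated mixture-of-experts (MoE) layer.
# All names and sizes are hypothetical; this is not the implementation from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    def __init__(self, d_model: int, d_ff: int, num_experts: int, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        # Router (gating network): scores every token against every expert.
        self.gate = nn.Linear(d_model, num_experts, bias=False)
        # Experts: independent feed-forward networks; only top_k run per token.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (num_tokens, d_model)
        logits = self.gate(x)                                  # (num_tokens, num_experts)
        weights, indices = torch.topk(logits, self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)                   # renormalize over the selected experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[:, k] == e                      # tokens routed to expert e in slot k
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

if __name__ == "__main__":
    layer = MoELayer(d_model=64, d_ff=256, num_experts=8, top_k=2)
    tokens = torch.randn(16, 64)
    print(layer(tokens).shape)  # torch.Size([16, 64])

In production systems the per-expert Python loop is replaced by batched token dispatch and all-to-all communication across devices, which is exactly where the memory, communication, and parallel-scaling bottlenecks surveyed in the paper arise.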