Architecture design, key technologies, and application research of a GPU heterogeneous resource pool platform based on virtualization

Cited by: 1

Authors: ZHANG Wancai; ZHANG Nan; YANG Wenqing; WANG Tao; ZHANG Wenqiang (Nari Technology Development Co., Ltd., Nanjing 211100, China)

Affiliation: [1] Nari Technology Development Co., Ltd., Nanjing, Jiangsu 211100, China

Source: Telecommunications Science (《电信科学》), 2024, No. 9, pp. 162-175 (14 pages)

Funding: Science and Technology Project of State Grid Corporation of China (No.524608210272).

Abstract: Computing resources for artificial intelligence currently face high prices and market supply disruptions. The traditional single-card, single-use model results in low resource utilization and efficiency, and existing techniques struggle to support the efficient management and scheduling of diverse heterogeneous graphics processing unit (GPU) resources. To address this, a virtualization-based GPU heterogeneous resource pool platform is proposed. First, the overall architecture, logical architecture, and functional architecture of the platform are planned and designed. Second, the key technologies are studied, and a virtualized heterogeneous GPU resource pool framework and a scheduling model based on time slicing plus load balancing are proposed. Finally, building on the proposed methods, several innovative application modes are put forward, including multi-service single-card stacking, cross-pull, cross-machine integration, hybrid deployment, and time division multiplexing. The proposed approach provides enterprise-level AI applications with GPU computing resources that are compatible with GPUs from multiple manufacturers, support remote access, can be flexibly partitioned and aggregated, and can be scheduled elastically. Calculation and analysis indicate that, for the same development and training workload, the number of GPU cards can be reduced by up to 60% and operating efficiency can be improved by a factor of four.

Keywords: GPU; heterogeneous resource pool; computing power platform; virtualization; time slicing; load balancing

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
