Improvement of Distributed Machine Learning Communication Performance Based on GPU Servers


Authors: Fan Ya-na; Li Yuan; Zhai Bin (Beijing Guodiantong Network Technology Co., Ltd., Beijing 100070, China)

Affiliation: [1] Beijing Guodiantong Network Technology Co., Ltd., Beijing 100070, China

Source: Technology and Information (《科学与信息化》), 2024, No. 17, pp. 46-48 (3 pages)

Abstract: GPU virtualization technology is driving the evolution of the cloud-server industry. In distributed machine learning, each node completes its optimization training locally, the result data are aggregated over the communication link, and the next training iteration then begins. By dividing distributed machine learning into functional modules, this paper shows that communication performance is the key constraint on its computing power. The communication performance of the hierarchical synchronization algorithm is first compared with that of the flat (planar) synchronization algorithm; then, using the global synchronization time (GST) as the characterization parameter, the advantages and disadvantages, deployment difficulty, and applicable scenarios of the different communication algorithms are compared.
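To make the flat-versus-hierarchical GST comparison concrete, the following is a minimal sketch, not taken from the paper, of a simple alpha-beta (latency-bandwidth) cost model. The function names, cluster shape, gradient size, and link parameters are all hypothetical assumptions introduced here for illustration; the paper's own measurements and algorithms are not reproduced.

```python
# Illustrative sketch only: an alpha-beta cost model comparing the global
# synchronization time (GST) of a flat (planar) ring all-reduce with a
# two-level hierarchical synchronization scheme. All numeric values below
# (latency, bandwidths, gradient size, cluster shape) are assumptions.

def flat_ring_allreduce_gst(num_workers, msg_bytes, alpha, bandwidth):
    """GST of a single-level ring all-reduce over num_workers participants.

    Classic cost model: 2*(N-1) communication steps, each moving msg_bytes/N
    over a link with per-message latency `alpha` and throughput `bandwidth`.
    """
    steps = 2 * (num_workers - 1)
    return steps * (alpha + (msg_bytes / num_workers) / bandwidth)


def hierarchical_gst(num_nodes, gpus_per_node, msg_bytes, alpha,
                     intra_bw, inter_bw):
    """GST of a two-level scheme: synchronize GPUs inside each node over the
    fast intra-node link, then synchronize one leader per node over the slower
    inter-node link. The intra-node reduce and the final intra-node broadcast
    are approximated together as one intra-node all-reduce pass.
    """
    intra = flat_ring_allreduce_gst(gpus_per_node, msg_bytes, alpha, intra_bw)
    inter = flat_ring_allreduce_gst(num_nodes, msg_bytes, alpha, inter_bw)
    return intra + inter


if __name__ == "__main__":
    # Hypothetical cluster: 8 nodes x 8 GPUs, 1 GiB of gradients per iteration.
    nodes, gpus_per_node = 8, 8
    grad_bytes = 1 * 1024 ** 3
    alpha = 5e-6        # per-message latency in seconds (assumed)
    intra_bw = 300e9    # intra-node link, bytes/s (NVLink-class, assumed)
    inter_bw = 12.5e9   # inter-node link, bytes/s (~100 Gb/s Ethernet, assumed)

    # The flat ring spans all GPUs, so its speed is limited by the slowest
    # (inter-node) link in this simplified model.
    flat = flat_ring_allreduce_gst(nodes * gpus_per_node, grad_bytes,
                                   alpha, inter_bw)
    hier = hierarchical_gst(nodes, gpus_per_node, grad_bytes, alpha,
                            intra_bw, inter_bw)
    print(f"flat GST         ~ {flat:.3f} s")
    print(f"hierarchical GST ~ {hier:.3f} s")
```

Under these assumed parameters the hierarchical scheme keeps most of the traffic on the fast intra-node link and only the per-node results cross the slow inter-node link, which is the kind of trade-off the GST comparison in the paper is meant to capture.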

Keywords: GPU; server; machine learning; communication frequency

Classification Code: TP3 [Automation and Computer Technology / Computer Science and Technology]

 
