Asynchronous Distributed Training Algorithm Based on Gossip  (Cited by: 1)

Authors: ZHOU Jia, TU Jun[1], REN Donglin (School of Computer Science, Hubei University of Technology, Wuhan 430068, China)

Affiliation: [1] School of Computer Science, Hubei University of Technology, Wuhan 430068, Hubei, China

Source: Journal of Hubei University of Technology, 2023, No. 1, pp. 43-46, 58 (5 pages)

Abstract: Ring AllReduce, an existing decentralized distributed communication algorithm, removes the bandwidth bottleneck at a central node, but it is synchronous, which leads to long inter-node communication waiting time in a cluster. Combining the Gossip protocol with Stochastic Gradient Descent (SGD), this paper proposes a decentralized, asynchronous communication framework for deep learning, Gossip Ring SGD (GR-SGD), which alleviates the long-waiting-time problem. Experiments on the ImageNet dataset with ResNet models verify the feasibility of GR-SGD and compare it with Ring AllReduce and D-PSGD (Decentralized Parallel SGD); GR-SGD completes training in less time.
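The core idea the abstract describes (nodes take local SGD steps and asynchronously average parameters with a ring neighbor via Gossip, with no central server) can be illustrated with a minimal sketch. This is not the paper's implementation: the toy quadratic losses, step size, activation schedule, and function name `gr_sgd` are all illustrative assumptions.

```python
import random

# Toy sketch of asynchronous gossip-ring SGD in the spirit of GR-SGD.
# Node i holds local loss f_i(x) = 0.5 * (x - t_i)^2; the optimum of
# the averaged objective is mean(t_i). No central node is involved.

def gr_sgd(targets, steps=2000, lr=0.05, seed=0):
    rng = random.Random(seed)
    n = len(targets)
    x = [0.0] * n                        # each node's local model parameter
    for _ in range(steps):
        i = rng.randrange(n)             # one node wakes up (asynchrony)
        x[i] -= lr * (x[i] - targets[i]) # local SGD step on f_i
        j = (i + 1) % n                  # successor on the ring topology
        avg = 0.5 * (x[i] + x[j])        # pairwise gossip averaging
        x[i] = x[j] = avg
    return x

models = gr_sgd([1.0, 2.0, 3.0, 4.0])
# All nodes should end up near the global optimum, mean(targets) = 2.5.
```

Because each update touches only one node and its ring neighbor, no node ever blocks waiting for a global barrier, which is the synchronization cost the paper attributes to Ring AllReduce.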

Keywords: decentralized distributed training; Gossip; asynchronous

CLC Number: TP399 [Automation and Computer Technology - Computer Application Technology]
