Authors: Keshi GE, Yiming ZHANG, Yongquan FU, Zhiquan LAI, Xiaoge DENG, Dongsheng LI
Affiliation: [1] College of Computer, National University of Defense Technology, Changsha 410073, China
Source: Science China (Information Sciences), 2023, Issue 6, pp. 134-150 (17 pages)
Funding: Supported in part by the National Natural Science Foundation of China (Grant Nos. 62025208, 61972409) and the National Key Research and Development Program of China (Grant No. 2021YFB0301200).
Abstract: Gradient quantization has been widely used in distributed training of deep neural network (DNN) models to reduce communication cost. However, existing quantization methods overlook that gradients have a nonuniform distribution that changes over time, which can lead to significant compression error. This error not only increases the number of training iterations but also requires a higher number of quantization bits (and consequently higher delay for each iteration) to keep the validation accuracy as high as that of the original stochastic gradient descent (SGD) approach. To address this problem, in this paper we propose cluster-aware sketch quantization (CASQ), a novel sketch-based gradient quantization method for SGD with convergence guarantees. CASQ models the nonuniform distribution of gradients via clustering, and adaptively allocates appropriate numbers of hash buckets based on the statistics of different clusters to compress gradients. Extensive evaluation shows that compared to existing quantization methods, CASQ-based SGD (i) achieves the same validation accuracy when decreasing the quantization level from 3 bits to 2 bits, and (ii) reduces the training time to convergence by up to 43% for the same training loss.
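The abstract's cluster-aware idea can be illustrated with a small sketch. The following Python snippet is a minimal illustration only, not the authors' implementation: gradient coordinates are grouped by magnitude (a stand-in for the clustering step), hash buckets are split across groups in proportion to each group's spread (an assumed allocation rule), and each group is compressed with count-sketch-style random-sign hashing. All function names and parameters (cluster_gradients, allocate_buckets, total_buckets, etc.) are hypothetical.

```python
# Minimal illustrative sketch of the cluster-aware idea described above.
# NOT the paper's implementation: the clustering step, the bucket-allocation
# rule, and all names/parameters here are assumptions for illustration only.
import numpy as np

def cluster_gradients(grad, n_clusters=4):
    # Group coordinates by magnitude using quantile boundaries
    # (a simple stand-in for the clustering step in the abstract).
    mags = np.abs(grad)
    edges = np.quantile(mags, np.linspace(0.0, 1.0, n_clusters + 1))
    edges[-1] += 1e-12  # make the top edge inclusive of the maximum
    labels = np.searchsorted(edges, mags, side="right") - 1
    return np.clip(labels, 0, n_clusters - 1)

def allocate_buckets(grad, labels, total_buckets=256):
    # Assumed rule: clusters with larger spread receive more hash buckets.
    n_clusters = int(labels.max()) + 1
    spread = np.array([grad[labels == c].std() + 1e-8 for c in range(n_clusters)])
    share = spread / spread.sum()
    return np.maximum(1, np.round(share * total_buckets)).astype(int)

def sketch_compress(grad, labels, buckets, seed=0):
    # Count-sketch-style compression per cluster: hash each coordinate into
    # the cluster's buckets with a random sign; decompression is a lookup.
    rng = np.random.default_rng(seed)
    recovered = np.zeros_like(grad)
    for c, n_buckets in enumerate(buckets):
        idx = np.flatnonzero(labels == c)
        if idx.size == 0:
            continue
        h = rng.integers(0, n_buckets, size=idx.size)  # bucket index per coordinate
        s = rng.choice([-1.0, 1.0], size=idx.size)     # random sign per coordinate
        table = np.zeros(n_buckets)
        np.add.at(table, h, s * grad[idx])             # the compressed message
        recovered[idx] = s * table[h]                  # unsketched estimate
    return recovered

if __name__ == "__main__":
    g = np.random.standard_cauchy(10_000) * 1e-3       # heavy-tailed toy "gradient"
    labels = cluster_gradients(g)
    buckets = allocate_buckets(g, labels)
    g_hat = sketch_compress(g, labels, buckets)
    print("relative L2 error:", np.linalg.norm(g - g_hat) / np.linalg.norm(g))
```

The printed relative error indicates how well a fixed bucket budget recovers a heavy-tailed vector in this toy setting; the actual CASQ scheme described in the abstract additionally quantizes the sketch and comes with convergence guarantees.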
Keywords: distributed training, deep learning, communication, sketch, quantization
Classification: TP18 [Automation and Computer Technology - Control Theory and Control Engineering]