Authors: WANG Xiaoxiao (王晓晓); ZHU Xiaojuan (朱晓娟) [1] (School of Computer Science and Engineering, Anhui University of Science & Technology, Huainan 232001, China)
Affiliation: [1] School of Computer Science and Engineering, Anhui University of Science & Technology, Huainan 232001, Anhui, China
Source: Journal of Hubei Minzu University: Natural Science Edition (《湖北民族大学学报(自然科学版)》), 2025, No. 1, pp. 34-40 (7 pages)
Funding: Key Project of Provincial Natural Science Research of Anhui Higher Education Institutions (KJ2020A0300).
Abstract: To address the high communication overhead and low training efficiency caused by frequent transmission of parameters and gradients between multiple computing nodes and parameter server nodes in distributed machine learning, a communication optimization method based on adaptive layered gradient compression (ALGC) is proposed. First, a suitable compression threshold is set for each layer of the neural network, and only layers exceeding that threshold are selected for compression. Second, a separate sparsification threshold is set for each selected layer and adjusted dynamically, so that gradient transmission is compressed adaptively for each layer. Finally, computation and communication are overlapped, and the parameter server aggregates the gradients and gradient residuals of each layer to update the global model. The results show that ALGC reaches a training accuracy of up to 95.07% and achieves the shortest convergence time and the largest speedup ratio. ALGC thus improves model training speed and reduces communication overhead while preserving training accuracy.
Keywords: distributed machine learning; gradient compression; parameter server; sparsification; communication optimization
Classification: TP183 [Automation and Computer Technology: Control Theory and Control Engineering]
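The first two steps summarized in the abstract (selecting which layers to compress, then sparsifying each selected layer with its own adjustable threshold) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not say what is compared against the layer-selection threshold or how the sparsification threshold is adjusted, so the sketch assumes that a layer is compressed when its element count exceeds `size_threshold`, that sparsification keeps entries whose magnitude reaches the layer's threshold, and that the threshold is scaled by a fixed `step` toward a `target_ratio` of transmitted entries. All function and parameter names are hypothetical. Residuals are kept on the worker (ordinary error feedback), which may differ from the paper, and the overlap of computation and communication is omitted.

```python
import numpy as np


def sparsify_layer(grad, residual, threshold):
    """Keep entries of (grad + residual) with magnitude >= threshold.

    Entries that are not sent are returned as the new residual, so no
    gradient mass is lost; it is only delayed to a later step.
    """
    acc = grad + residual
    mask = np.abs(acc) >= threshold
    sparse = np.where(mask, acc, 0.0)
    return sparse, acc - sparse


def adapt_threshold(threshold, sent_ratio, target_ratio, step=1.1):
    """Assumed adjustment rule: raise the threshold when too many entries
    were sent, lower it otherwise."""
    if sent_ratio > target_ratio:
        return threshold * step
    return threshold / step


def compress_gradients(grads, residuals, thresholds, size_threshold,
                       target_ratio=0.1):
    """Return the per-layer payload a worker would send to the server.

    grads, residuals: dicts mapping layer name -> np.ndarray
    thresholds:       dict mapping layer name -> float (updated in place)
    size_threshold:   layers with at most this many elements are sent dense
    """
    payload = {}
    for name, grad in grads.items():
        if grad.size <= size_threshold:
            # Small layer: compression is assumed not to pay off, send as-is.
            payload[name] = grad
            continue
        sparse, residuals[name] = sparsify_layer(
            grad, residuals[name], thresholds[name])
        sent_ratio = np.count_nonzero(sparse) / grad.size
        thresholds[name] = adapt_threshold(
            thresholds[name], sent_ratio, target_ratio)
        payload[name] = sparse
    return payload
```

A worker would call `compress_gradients` once per iteration and reuse `residuals` and `thresholds` across iterations, so that unsent gradient entries accumulate until they cross the layer's current threshold.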