Deep Learning Training via Distributed Approximate Newton-Type Method Based on Adam Local Optimization    (Cited by: 16)

Authors: Bi Changyao; Yuan Xiaotong [1,2]

Affiliations: [1] School of Automation, Nanjing University of Information Science and Technology, Nanjing 210044, Jiangsu, China; [2] Jiangsu Key Laboratory of Big Data Analysis Technology, Nanjing 210044, Jiangsu, China

Source: Computer Applications and Software, 2021, Issue 10, pp. 278-283 (6 pages)

Funding: National Natural Science Foundation of China (61876090, 61936005); National Major Project of New Generation Artificial Intelligence (2018AAA0100401).

Abstract: Distributed learning is one of the most promising tools for alleviating the pressure of ever-increasing data and model scale in modern machine learning systems. The DANE algorithm is an approximate Newton-type method widely used for communication-efficient distributed machine learning. It converges quickly and does not require computing the inverse of the Hessian matrix, which significantly reduces communication and computational costs in high-dimensional settings. To further improve computational efficiency, it is necessary to study how to accelerate DANE's local optimization. A feasible approach is to solve DANE's local single-machine subproblems with Adam, one of the most popular adaptive gradient optimization algorithms, in place of the conventionally used stochastic gradient descent (SGD). Experiments show that the Adam-based optimization converges markedly faster than the original SGD-based implementation, while sacrificing almost no model generalization performance.
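For context, the local subproblem that DANE assigns to worker $i$ at round $t$ has the following form in the original formulation (Shamir et al., 2014); the modification studied here is to minimize it with Adam rather than SGD:

$$ w_i^{t+1} = \arg\min_w \; f_i(w) - \left(\nabla f_i(w^t) - \eta\,\nabla f(w^t)\right)^{\top} w + \frac{\mu}{2}\left\| w - w^t \right\|^2 $$

Below is a minimal sketch of one such local step in PyTorch. The function name dane_local_step, the callable local_loss, and the hyperparameter defaults are hypothetical illustrations rather than names from the paper; only torch.optim.Adam is the standard library API.

```python
import torch

def dane_local_step(model, local_loss, global_grads, eta=1.0, mu=0.01,
                    inner_steps=50, lr=1e-3):
    """Approximately solve the DANE local subproblem with Adam instead of SGD.

    Subproblem (Shamir et al., 2014):
        min_w  f_i(w) - <grad f_i(w_t) - eta * grad f(w_t), w>
                      + (mu / 2) * ||w - w_t||^2
    """
    params = list(model.parameters())
    # Snapshot of the global iterate w_t received from the server.
    w_t = [p.detach().clone() for p in params]
    # Local gradient of f_i at w_t; treated as a constant in the subproblem.
    local_grads = torch.autograd.grad(local_loss(model), params)
    # Linear correction term: grad f_i(w_t) - eta * grad f(w_t).
    corrections = [gi - eta * g for gi, g in zip(local_grads, global_grads)]

    optimizer = torch.optim.Adam(params, lr=lr)  # the Adam local solver
    for _ in range(inner_steps):
        optimizer.zero_grad()
        obj = local_loss(model)  # f_i(w), e.g. a mini-batch loss
        for p, c, w0 in zip(params, corrections, w_t):
            obj = obj - (c * p).sum() + 0.5 * mu * ((p - w0) ** 2).sum()
        obj.backward()
        optimizer.step()
    # The server would average these parameters across workers to form w^{t+1}.
    return [p.detach().clone() for p in params]
```

In a full implementation, each worker runs this step on its own data shard and the server averages the returned parameters. When local_loss draws random mini-batches (the random sampling mentioned in the keywords), the inner loop is a stochastic solver, which is exactly where replacing SGD with Adam accelerates convergence.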

Keywords: deep learning; approximate Newton method; distributed optimization; Adam algorithm; random sampling

Classification Code: TP181 [Automation and Computer Technology / Control Theory and Control Engineering]

 
