非光滑强凸情形Adam型算法的最优收敛速率  被引量:3

The Optimal Convergence Rate of Adam-Type Algorithms for Non-Smooth Strongly Convex Cases

在线阅读下载全文

作  者:陇盛 陶蔚 张泽东 陶卿 LONG Sheng;TAO Wei;ZHANG Ze-dong;TAO Qing(Science and Technology on Information Systems Engineering Laboratory,National University of Defense Technology,Changsha,Hunan 410073,China;Center for Strategic Assessment and Consulting,Academy of Military Science,Beijing 100091,China;Department of Information Engineering,Army Academy of Artillery and Air Defense,Hefei,Anhui 230031,China)

机构地区:[1]国防科技大学信息系统工程重点实验室,湖南长沙410073 [2]军事科学院战略评估咨询中心,北京100091 [3]陆军炮兵防空兵学院信息工程系,安徽合肥230031

出  处:《电子学报》2022年第9期2049-2059,共11页Acta Electronica Sinica

基  金:国家自然科学基金(No.62076252,No.62106281)。

摘  要:对于非光滑强凸问题,在线梯度下降(Online Gradient Decent,OGD)取适当步长参数可以得到对数阶后悔界.然而,这并不能使一阶随机优化算法达到最优收敛速率.为解决这一问题,研究者通常采取两种方案:其一是改进算法本身,另一种是修改算法输出方式.典型的Adam(Adaptive moment estimation)型算法SAdam(Strongly convex Adaptive moment esti⁃mation)采用了改进算法的方式,并添加了自适应步长策略和动量技巧,虽然得到更好的数据依赖的后悔界,但在随机情形仍然达不到最优.针对这个问题,本文改用加权平均的算法输出方式,并且重新设计与以往算法同阶的步长超参数,提出了一种名为WSAdam(Weighted average Strongly convex Adaptive moment estimation)的Adam型算法.证明了WSAdam达到了非光滑强凸问题的最优收敛速率.经过Reddi问题的测试和在非光滑强凸函数优化中的实验,验证了所提方法的有效性.For non-smooth strongly convex problems,the logarithmic regret bound can be obtained by online gradi⁃ent descent with the appropriate step size parameter.However,it cannot make the first order stochastic optimization algo⁃rithm achieve the optimal convergence rate.To solve this problem,researchers usually adopt two schemes:one is to modify the iterate of the algorithm,the other is to change the output mode of the algorithm.SAdam,a typical Adam-type algorithm,modify the algorithm by adding adaptive step size strategy and momentum technique.Although it obtains a tighter data de⁃pendent regret bound,it still cannot reach the optimal bound in the stochastic case.To solve this problem,this paper rede⁃signs the step size scheme which is the same order as the previous algorithm,uses the weighted average algorithm output mode,and proposes an Adam-type algorithm named WSAdam.It is proved that WSAdam achieves the optimal conver⁃gence rate for non-smooth strongly convex problems.The validity of the proposed method is verified by the test of Reddi's problem and the experiment in the optimization of non-smooth strongly convex functions.

关 键 词:非光滑 强凸优化 自适应步长 动量方法 Adam型算法 加权平均 收敛速率 

分 类 号:TP301[自动化与计算机技术—计算机系统结构]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象