Authors: WEI Hongxu; LONG Sheng; TAO Wei; TAO Qing
Affiliations: [1] Department of Information Engineering, Army Academy of Artillery and Air Defense of PLA, Hefei 230031, China; [2] Laboratory for Big Data and Decision, College of Systems Engineering, National University of Defense Technology, Changsha 410073, China; [3] Institute of Evaluation and Assessment Research, PLA Academy of Military Science, Beijing 100091, China
Source: Computer Science (《计算机科学》), 2023, No. 11, pp. 220-226 (7 pages)
Abstract: Adaptive strategies and momentum methods are commonly used to improve the performance of optimization algorithms. Most current adaptive gradient methods adopt an AdaGrad-type strategy, but this strategy performs poorly on constrained optimization. To address this, the AdaGrad+ method was proposed as a better fit for constrained problems; however, like SGD, it does not attain the optimal individual convergence rate in the non-smooth convex case, and combining it with NAG momentum also fails to achieve the expected acceleration. To address these problems, this paper proposes an adaptive momentum method based on AdaGrad+, which combines the AdaGrad+ step-size adjustment strategy with the accelerated convergence of the Heavy-Ball momentum method. By setting a weighted momentum term, carefully selecting time-varying parameters, and flexibly handling the adaptive matrix, the method is proved to achieve the optimal individual convergence rate for non-smooth general convex problems. Finally, the correctness of the theoretical analysis is verified by solving a typical hinge-loss optimization problem under an l∞-norm constraint, and experiments on training deep convolutional neural networks confirm that the method also performs well in practice.
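The abstract describes the method only at a high level. As a rough illustration of the kind of update it points to, the following Python sketch combines an AdaGrad-style diagonal adaptive step size with a Heavy-Ball momentum term and a projection onto an l∞-ball constraint, applied to the hinge loss. The function names (hinge_loss_subgrad, project_linf, adagrad_plus_heavy_ball), the momentum schedule beta_t, and all parameter choices are illustrative assumptions; they are not the paper's actual weighted momentum term or time-varying parameter settings.

```python
import numpy as np

def hinge_loss_subgrad(w, X, y):
    """Subgradient of the average hinge loss (1/n) * sum(max(0, 1 - y_i * x_i.w))."""
    margins = y * (X @ w)
    active = margins < 1.0
    # Subgradient contribution -y_i * x_i for violated margins, 0 otherwise (averaged)
    return -(X[active].T @ y[active]) / len(y)

def project_linf(w, radius):
    """Euclidean projection onto the l_inf ball {w : ||w||_inf <= radius}."""
    return np.clip(w, -radius, radius)

def adagrad_plus_heavy_ball(X, y, radius=1.0, T=1000, alpha=1.0, eps=1e-8):
    """Hypothetical sketch of an AdaGrad-style adaptive Heavy-Ball update.

    A diagonal adaptive matrix scales the step coordinate-wise, a momentum
    term reuses the previous displacement, and each iterate is projected
    back onto the l_inf-ball constraint. The beta_t schedule below is an
    assumed placeholder, not the paper's analyzed parameter choice.
    """
    n, d = X.shape
    w = np.zeros(d)
    w_prev = np.zeros(d)
    g_sq_sum = np.zeros(d)  # running sum of squared subgradients (diagonal adaptive matrix)
    for t in range(1, T + 1):
        g = hinge_loss_subgrad(w, X, y)
        g_sq_sum += g ** 2
        step = alpha / (np.sqrt(g_sq_sum) + eps)  # AdaGrad-style coordinate-wise step size
        beta_t = (t - 1) / (t + 1)                # time-varying momentum weight (assumed)
        w_next = project_linf(w - step * g + beta_t * (w - w_prev), radius)
        w_prev, w = w, w_next
    return w  # last (individual) iterate
```

For example, on a small synthetic binary classification problem one could call `adagrad_plus_heavy_ball(X, y, radius=1.0, T=2000)` and track the hinge loss of the returned iterate over iterations; the paper's claimed guarantee concerns the convergence rate of such an individual iterate, which this sketch does not reproduce or prove.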
Keywords: convex optimization; adaptive strategy; AdaGrad+; Heavy-Ball momentum method; convergence rate
Classification: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]