Affiliation: [1] LMAM and School of Mathematical Sciences, Peking University, Beijing 100871, China
Source: Numerical Mathematics (Theory, Methods and Applications) [高等学校计算数学学报(英文版)], 2023, No. 4, pp. 914-930 (17 pages)
Funding: Supported by the NSFC (Grant No. 11825102), the China Postdoctoral Science Foundation (Grant No. 2023M730093), and the National Key R&D Program of China (Grant No. 2021YFA1003300).
Abstract: In this paper, we first reinvestigate the convergence of the vanilla SGD method in the L2 sense under more general learning-rate conditions and a more general convexity assumption, which relaxes the conditions on the learning rates and does not require the problem to be strongly convex. Then, using the Lyapunov function technique, we establish the convergence of the momentum SGD and Nesterov accelerated SGD methods for convex and non-convex problems under an L-smooth assumption, which relaxes the bounded-gradient restriction to a certain extent. The convergence of time-averaged SGD is also analyzed.
Keywords: SGD; momentum SGD; Nesterov acceleration; time averaged SGD; convergence analysis; non-convex
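
For readers unfamiliar with the four methods named in the abstract, the following is a minimal sketch of their textbook update rules. It is not taken from the paper: the objective f(x) = 0.5 * x**2, the Gaussian gradient noise, the step-size schedule `lr`, and the momentum coefficient `beta` are all illustrative assumptions, and the paper's exact formulations and learning-rate conditions may differ.

```python
import random


def grad(x, rng, noise=0.1):
    """Noisy gradient of f(x) = 0.5 * x**2 (assumed toy objective, 1-smooth)."""
    return x + noise * rng.gauss(0.0, 1.0)


def lr(k, a=0.5, p=0.75):
    """Diminishing step size a / (k + 1)**p (hypothetical schedule)."""
    return a / (k + 1) ** p


def sgd(x0, steps, seed=0):
    """Vanilla SGD; also returns the running mean of the iterates
    (time-averaged SGD)."""
    rng = random.Random(seed)
    x, avg = x0, 0.0
    for k in range(steps):
        x = x - lr(k) * grad(x, rng)
        avg += (x - avg) / (k + 1)  # incremental mean of x_1, ..., x_{k+1}
    return x, avg


def momentum_sgd(x0, steps, beta=0.9, seed=0):
    """Heavy-ball momentum: the velocity v accumulates past gradient steps."""
    rng = random.Random(seed)
    x, v = x0, 0.0
    for k in range(steps):
        v = beta * v - lr(k) * grad(x, rng)
        x = x + v
    return x


def nesterov_sgd(x0, steps, beta=0.9, seed=0):
    """Nesterov momentum: the gradient is evaluated at the look-ahead
    point x + beta * v instead of at x."""
    rng = random.Random(seed)
    x, v = x0, 0.0
    for k in range(steps):
        v = beta * v - lr(k) * grad(x + beta * v, rng)
        x = x + v
    return x


if __name__ == "__main__":
    last, averaged = sgd(5.0, 5000)
    print("SGD last iterate:", last)
    print("SGD time average:", averaged)
    print("momentum SGD:", momentum_sgd(5.0, 5000))
    print("Nesterov SGD:", nesterov_sgd(5.0, 5000))
```

On this toy problem all four outputs should end up near the minimizer x = 0; the sketch only illustrates the update rules and says nothing about the convergence rates proved in the paper.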