Authors: GAO Yuanxiang (高远翔), LUO Long (罗龙)[1], SUN Gang (孙罡)[1] (Key Lab of Optical Fiber Sensing and Communications, University of Electronic Science and Technology of China, Chengdu 611731)
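Affiliation: [1] Key Laboratory of Optical Fiber Sensing and Communications (Ministry of Education), University of Electronic Science and Technology of China, Chengdu 611731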
Source: Journal of University of Electronic Science and Technology of China (《电子科技大学学报》), 2022, No. 2, pp. 200-206 (7 pages)
Funding: National Key R&D Program of China (2019YFB1802800).
Abstract: Multi-stage networks are widely used in machine learning clusters. Because a multi-stage network offers many available paths, packet routing is a hard combinatorial optimization problem. Existing heuristic routing algorithms lack performance guarantees, which seriously degrades packet transmission delay. This paper proposes a reinforcement-learning-based packet routing method for multi-stage networks that uses a novel policy iteration algorithm to learn an optimal routing policy. In the policy evaluation step, the algorithm uses a maximum likelihood estimator of the value function, which overcomes the low sample efficiency of the Monte Carlo (MC) and Temporal-Difference (TD) value estimators used in reinforcement learning. To handle the high computational complexity of combinatorial optimization, the policy improvement step decomposes the optimization over the combinatorial action space into a sequential optimization over the component actions, which improves solving efficiency. Simulation experiments on the NS-3 network simulator show that the routing policy learned by the algorithm reduces average packet delay by 13.9% compared with the best existing heuristic routing policy.
Classification: TN915 [Electronics and Telecommunications: Communication and Information Systems]
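The decomposition mentioned in the abstract, replacing one optimization over a combinatorial action space with a sequence of per-component optimizations, can be illustrated with a toy sketch. This is not the paper's algorithm: the paper optimizes against a learned value function, whereas the `joint_cost` function below is a hypothetical congestion proxy (sum of squared per-path loads) chosen only for illustration, and the names `brute_force` and `sequential` are likewise assumptions. In general, sequential optimization is not guaranteed to reach the joint optimum; it happens to do so for this simple cost.

```python
from itertools import product


def joint_cost(assignment, num_paths):
    """Hypothetical congestion proxy: sum of squared per-path loads."""
    loads = [0] * num_paths
    for path in assignment:
        loads[path] += 1
    return sum(load * load for load in loads)


def brute_force(num_packets, num_paths):
    """Search the full joint action space (num_paths ** num_packets tuples)."""
    return min(
        product(range(num_paths), repeat=num_packets),
        key=lambda a: joint_cost(a, num_paths),
    )


def sequential(num_packets, num_paths):
    """Pick one component action at a time, holding earlier choices fixed.

    Only num_packets * num_paths cost evaluations are needed.
    """
    chosen = []
    for _ in range(num_packets):
        best = min(
            range(num_paths),
            key=lambda p: joint_cost(chosen + [p], num_paths),
        )
        chosen.append(best)
    return tuple(chosen)


if __name__ == "__main__":
    packets, paths = 5, 3
    print(sequential(packets, paths), joint_cost(sequential(packets, paths), paths))
    print(joint_cost(brute_force(packets, paths), paths))
```

With 5 packets and 3 paths, the brute-force search scores 243 joint actions while the sequential version scores 15 partial assignments, which is the kind of saving the decomposition is aiming for.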