Authors: 黄万伟 (HUANG Wanwei); 郑向雨 (ZHENG Xiangyu); 张超钦 (ZHANG Chaoqin); 王苏南 (WANG Sunan) [3]; 张校辉 (ZHANG Xiaohui)
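Affiliations: [1] College of Software Engineering, Zhengzhou University of Light Industry, Zhengzhou 450001, Henan, China; [2] College of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou 450001, Henan, China; [3] School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen 518055, Guangdong, China; [4] Henan Xin'an Communication Technology Co., Ltd., Zhengzhou 450001, Henan, China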
Source: Journal of Zhengzhou University (Engineering Science) (《郑州大学学报(工学版)》), 2023, No. 1, pp. 44-51 (8 pages)
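Funding: National Natural Science Foundation of China (62002382, 62072416); Henan Province Key R&D and Promotion Special Project (Science and Technology Research) (222102210175, 222102210111); 2022 Henan Province Professional Degree Graduate Excellent Teaching Case Project (YJS2022AL035).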
Abstract: To address the slow convergence, high average delay, and low bandwidth utilization of existing intelligent routing algorithms, this study proposes RDPG-Route, a multi-path intelligent routing algorithm based on deep reinforcement learning (DRL). The algorithm uses the recurrent deterministic policy gradient (RDPG) as its training framework and introduces a long short-term memory (LSTM) network as the neural network. Drawing on RDPG's strength in handling high-dimensional problems and on the storage capacity of the memory cell in the LSTM recurrent kernel, the dynamically changing network state is fed into the neural network for training. Once training has converged, the action values output by the neural network are used as network link weights, and traffic is split according to a multi-path routing strategy so that network routing is adjusted intelligently and dynamically. Finally, RDPG-Route is compared with the ECMP, DRL-TE, and DRL-R-DDPG routing algorithms. The results indicate that RDPG-Route has good convergence and effectiveness: compared with the other intelligent routing algorithms, it reduces average end-to-end delay by at least 7.2%, improves throughput by 6.5%, and reduces the packet loss rate by 8.9% and the maximum link utilization by 6.3%.
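The step in which link weights drive a multi-path traffic split can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how paths are selected or how traffic is divided among them, so the rule used here (take the k lowest-cost loop-free paths and split the demand in inverse proportion to path cost) is an assumption, as are the function names, the toy topology, and the weight values, which stand in for the action values a trained agent would output.

```python
def simple_paths(graph, src, dst, path=None):
    """Yield every loop-free path from src to dst (depth-first search)."""
    path = (path or []) + [src]
    if src == dst:
        yield path
        return
    for nxt in graph.get(src, {}):
        if nxt not in path:
            yield from simple_paths(graph, nxt, dst, path)


def path_cost(graph, path):
    """Sum the link weights along a path."""
    return sum(graph[u][v] for u, v in zip(path, path[1:]))


def split_traffic(graph, src, dst, demand, k=2):
    """Split `demand` over the k lowest-cost paths.

    Assumption: each path's share is proportional to 1 / path cost.
    Link weights must be strictly positive.
    """
    paths = sorted(simple_paths(graph, src, dst),
                   key=lambda p: path_cost(graph, p))[:k]
    if not paths:
        return {}
    inverse = [1.0 / path_cost(graph, p) for p in paths]
    total = sum(inverse)
    return {tuple(p): demand * w / total for p, w in zip(paths, inverse)}


# Hypothetical topology: graph[u][v] is the weight of link u -> v,
# standing in for the action values produced by the trained network.
graph = {
    "A": {"B": 1.0, "C": 2.0},
    "B": {"D": 1.0},
    "C": {"D": 2.0},
    "D": {},
}

allocation = split_traffic(graph, "A", "D", demand=90.0, k=2)
for path, share in allocation.items():
    print(" -> ".join(path), share)
```

In this toy case the path through B costs half as much as the path through C, so it carries twice the traffic. Exhaustive path enumeration is only practical for small topologies; a real deployment would use a k-shortest-paths algorithm instead.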
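Keywords: quality of experience; software-defined networking; deep reinforcement learning; routing algorithm; recurrent deterministic policy gradient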
Classification: TP393 [Automation and Computer Technology: Computer Application Technology]