Authors: LI Jing (李婧); HOU Shiqi (侯诗琪) (College of Computer Science and Technology, Shanghai University of Electric Power, Shanghai 201306, China)
Affiliation: [1] College of Computer Science and Technology, Shanghai University of Electric Power, Shanghai 201306
Source: Chinese Journal of Construction Machinery (《中国工程机械学报》), 2022, No. 4, pp. 288-293 (6 pages)
Funding: National Natural Science Foundation of China (61872230, 61572311).
Abstract: In high-traffic transmission scenarios, traditional heuristic routing protocols cannot dynamically adjust their routing strategy to the network state, while data-driven routing protocols cannot guarantee network throughput in the early stage of training. To address this problem, a safe reinforcement learning routing algorithm guided by prior knowledge is proposed. It introduces prior knowledge into the action selection of the deep reinforcement learning model and, combined with the ε-greedy strategy, evaluates and constrains the next hop according to the network state, supplying a better action when necessary and avoiding invalid actions. Simulation experiments based on Keras and Networkx show that the algorithm keeps network throughput high, confines network performance fluctuation to a small range, and significantly improves the model's convergence speed.
Keywords: prior knowledge; deep reinforcement learning; routing; intelligent routing; throughput
Classification: TP393.1 [Automation and Computer Technology: Computer Application Technology]
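
Illustrative sketch (not the authors' implementation): the abstract describes ε-greedy next-hop selection that prior knowledge evaluates and constrains. One way this could look is shown below in plain Python. The function name `safe_epsilon_greedy`, the spare-capacity rule used as "prior knowledge", and the fallback to the least-loaded link are all assumptions made for illustration; in the paper the action values would come from a deep reinforcement learning model, which is replaced here by a fixed dictionary.

```python
import random


def safe_epsilon_greedy(q_values, free_capacity, demand, epsilon, rng=random):
    """Pick a next hop with epsilon-greedy, restricted by a prior-knowledge check.

    q_values:      dict next_hop -> estimated action value (stand-in for a DRL model)
    free_capacity: dict next_hop -> spare capacity of the link to that hop (assumed)
    demand:        amount of traffic to forward
    epsilon:       exploration probability in [0, 1]
    """
    # Assumed prior-knowledge rule: a hop is valid only if its link can carry the demand.
    safe = [hop for hop in q_values if free_capacity[hop] >= demand]

    if not safe:
        # No hop passes the check: substitute the least-loaded link (assumed fallback).
        return max(q_values, key=lambda hop: free_capacity[hop])

    if rng.random() < epsilon:
        # Explore, but only among hops the prior knowledge allows.
        return rng.choice(safe)

    # Exploit: best estimated value among the allowed hops.
    return max(safe, key=lambda hop: q_values[hop])


# Hypothetical values for three neighbors of the current node.
q = {"B": 0.9, "C": 0.4, "D": 0.7}
cap = {"B": 1.0, "C": 8.0, "D": 6.0}

greedy_hop = safe_epsilon_greedy(q, cap, demand=5.0, epsilon=0.0)     # "D"
fallback_hop = safe_epsilon_greedy(q, cap, demand=10.0, epsilon=0.0)  # "C"
```

In the first call the hop with the highest value, B, is rejected because its link lacks capacity, so the greedy choice falls to D. In the second call no link can carry the demand, so the sketch returns the link with the most spare capacity. Restricting exploration to the allowed set is what keeps early random actions from being invalid.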