Scheduling of dynamic tasks in e-government clouds using deep reinforcement learning


Authors: Long Yujie; Xiu Xi; Huang Qing; Huang Xiaomian; Li Ying; Wu Weigang [1]

Affiliations: [1] School of Computer Science & Engineering, Sun Yat-Sen University, Guangzhou 510006, China; [2] Guangzhou Digital Government Operations Center, Guangzhou 510635, China; [3] Guangdong Yixun Technology Co., Ltd., Guangzhou 510635, China; [4] Bingo Software Co., Ltd., Guangzhou 510663, China

Source: Application Research of Computers (《计算机应用研究》), 2024, No. 6, pp. 1797-1802 (6 pages)

Abstract: Task scheduling in e-government cloud centers has long been a complex problem. Most existing scheduling methods rely on expert knowledge, generalize poorly, and cannot cope with dynamic cloud environments, which typically leads to lower resource utilization, degraded quality of service, and longer makespan. To address this, the paper proposes a deep reinforcement learning (DRL) scheduling method based on the actor-critic (A2C) algorithm. First, the actor network parameterizes the policy and selects a scheduling action from the current system state, while the critic network scores the current system state. Then, the actor policy network is updated by gradient ascent, using the critic's score to judge how good each action was. Finally, simulation experiments are run on two real-world business datasets. The results show that, compared with the classic policy gradient algorithm and five heuristic task scheduling methods, the method improves resource utilization in cloud data centers, shortens the makespan of offline tasks, and adapts better to dynamic e-government cloud environments.
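The actor-critic update described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy environment (two task types, two machines, the `PROCESSING_TIME` table), the reward (negative processing time), the learning rates, and the use of small lookup tables in place of the actor and critic neural networks. Each episode is a single scheduling decision, so the critic's target reduces to the immediate reward.

```python
import math
import random

# Hypothetical toy setup (not from the paper): two task types, two machines.
# PROCESSING_TIME[task_type][machine] is how long that task takes there.
PROCESSING_TIME = [[1.0, 3.0], [3.0, 1.0]]


def softmax(logits):
    """Turn a list of logits into a probability distribution."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_actor_critic(episodes=2000, lr_actor=0.1, lr_critic=0.1, seed=0):
    """Tabular one-step actor-critic on the toy scheduling problem."""
    rng = random.Random(seed)
    n_states = len(PROCESSING_TIME)
    n_actions = len(PROCESSING_TIME[0])
    logits = [[0.0] * n_actions for _ in range(n_states)]  # actor parameters
    values = [0.0] * n_states                              # critic parameters

    for _ in range(episodes):
        state = rng.randrange(n_states)            # type of the arriving task
        probs = softmax(logits[state])             # actor: policy for this state
        action = rng.choices(range(n_actions), weights=probs)[0]
        reward = -PROCESSING_TIME[state][action]   # shorter processing is better

        # Critic's score acts as a baseline: how much better or worse than
        # expected was this action? (One-step episode, so no bootstrapping.)
        advantage = reward - values[state]

        # Critic update: move the state value toward the observed reward.
        values[state] += lr_critic * advantage

        # Actor update: gradient ascent on advantage * log pi(action | state).
        # For a softmax policy, d log pi / d logit_a = 1[a == action] - pi(a).
        for a in range(n_actions):
            grad_log_pi = (1.0 if a == action else 0.0) - probs[a]
            logits[state][a] += lr_actor * advantage * grad_log_pi

    policy = [softmax(row) for row in logits]
    return policy, values
```

Calling `train_actor_critic()` returns the learned per-task-type machine probabilities and the critic's state values. In this toy setup the policy should come to favor the faster machine for each task type; a realistic scheduler would replace the tables with neural networks and the one-step reward with a multi-step return.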

Keywords: e-government; cloud computing; task scheduling; deep reinforcement learning; actor-critic algorithm

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
