Deep reinforcement learning based task offloading decision and resource allocation method in edge and cloud environments

Authors: He Dahang, Wang Yu, Zuo Liyun

Affiliations: [1] College of Computer, Guangdong University of Petrochemical Technology, Maoming, Guangdong 525000, China; [2] Network & Education Information Technology Center, Guangdong University of Petrochemical Technology, Maoming, Guangdong 525000, China

Source: Application Research of Computers, 2025, No. 2, pp. 486-493 (8 pages)

Funding: Natural Science Foundation of Guangdong Province (2024A1515010144); Special Project in Key Fields of Guangdong Provincial Universities (2023ZDZX3013); Yangfan Plan of the Maoming Green Chemical Industry Research Institute (MMGCIRI-2022YFJH-Y-012).

Abstract: Edge computing allows Internet of Things devices to offload tasks to edge and cloud environments for execution in order to meet the tasks' resource demands. Owing to the highly stochastic and dynamic nature of edge-cloud environments, heuristic algorithms and Q-table based reinforcement learning algorithms struggle to make efficient offloading decisions for heterogeneous tasks. This paper therefore proposes a novel deep reinforcement learning algorithm, the novel dueling and double deep Q network (ND3QN), for efficient task offloading and resource allocation in edge and cloud environments. ND3QN jointly optimizes task completion time and cost, and innovatively constructs a state representation that contains dynamic information about the environment; it designs a reward function that effectively guides training, and realizes fine-grained offloading, i.e., offloading tasks to the servers' virtual machines. Experimental results show that ND3QN exhibits clear differences in convergence speed and converged value under different exploration rates and learning rates, and that it outperforms the baseline algorithms in terms of task drop rate, completion time, and cost; ablation experiments confirm the effectiveness of the improved state and reward function. ND3QN can therefore effectively improve the efficiency of task offloading and resource allocation in edge and cloud environments.
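A minimal illustrative sketch of the underlying technique may help readers unfamiliar with dueling and double deep Q networks. The PyTorch snippet below shows a generic dueling Q-network head and a double-DQN loss; it is not the authors' ND3QN implementation, and the state dimension, action dimension, hidden size, discount factor, and the mapping of actions to candidate virtual machines are assumptions chosen only for demonstration.

    # Illustrative sketch only: a generic dueling + double DQN in PyTorch.
    # Sizes and hyperparameters below are assumptions, not the paper's ND3QN settings.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class DuelingQNetwork(nn.Module):
        """Dueling head: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""

        def __init__(self, state_dim: int, action_dim: int, hidden: int = 128):
            super().__init__()
            self.feature = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
            self.value = nn.Linear(hidden, 1)               # state-value stream V(s)
            self.advantage = nn.Linear(hidden, action_dim)  # advantage stream A(s, a)

        def forward(self, state: torch.Tensor) -> torch.Tensor:
            h = self.feature(state)
            v = self.value(h)                               # shape (batch, 1)
            a = self.advantage(h)                           # shape (batch, action_dim)
            return v + a - a.mean(dim=1, keepdim=True)      # aggregated Q-values

    def double_dqn_loss(online_net, target_net, batch, gamma=0.99):
        """Double DQN update: the online network selects the greedy next action,
        the target network evaluates it, which reduces Q-value overestimation."""
        states, actions, rewards, next_states, dones = batch
        q = online_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
        with torch.no_grad():
            next_actions = online_net(next_states).argmax(dim=1, keepdim=True)
            next_q = target_net(next_states).gather(1, next_actions).squeeze(1)
            targets = rewards + gamma * (1.0 - dones) * next_q
        return F.mse_loss(q, targets)

    # Toy usage: a hypothetical offloading state (e.g. task size, deadline, VM loads)
    # mapped to a discrete choice among 4 candidate virtual machines.
    online_net = DuelingQNetwork(state_dim=10, action_dim=4)
    target_net = DuelingQNetwork(state_dim=10, action_dim=4)
    target_net.load_state_dict(online_net.state_dict())

The dueling head separates state-value and advantage estimates, while the double-DQN target decouples action selection from evaluation; these two mechanisms are what distinguish this family of algorithms from a plain deep Q network.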

Keywords: deep reinforcement learning; edge computing; task offloading; resource allocation; deep Q network

Classification code: TP393 (Automation and Computer Technology: Computer Application Technology)

 
