Multi-tasks resource joint optimization based on asynchronous reward deep deterministic policy gradient in edge computing

Cited by: 3

Authors: Zhou Heng, Li Lijun [1], Dong Zengshou [1]

Affiliation: [1] New Technology Research Laboratory of Intelligent Network, School of Electronic Information Engineering, Taiyuan University of Science & Technology, Taiyuan 030024, China

Source: Application Research of Computers (《计算机应用研究》), 2023, No. 5, pp. 1491-1496 (6 pages)

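Funding: Shanxi Province Research Funding Program for Returned Overseas Scholars (2020-126, 2021-134, 2021-135); Shanxi Province Key Research and Development Program (201903D121023); Shanxi Province Basic Research Program, General Project (20210302123206).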

Abstract: In a mobile edge computing (MEC) system, terminal devices with insufficient local computing capacity and battery energy can decide whether to offload delay-sensitive tasks to edge nodes for execution. To address the random generation of user tasks and the dynamic variation of system resources during offloading, this paper proposes an asynchronous reward deep deterministic policy gradient (ARDDPG) algorithm. Unlike traditional resource allocation for independent tasks, where each task waits in sequence for the previous one to finish, ARDDPG allocates resources in the time slot in which a task is generated, without waiting for the previous task to complete, and obtains the task's computation reward asynchronously. Under delay constraints, the algorithm jointly optimizes the task offloading decision, dynamic bandwidth allocation, and computing resource allocation, and it trains neural networks with deep deterministic policy gradient to search for the best achievable performance. Simulation results show that, compared with a random strategy, a baseline strategy, and the DQN algorithm, ARDDPG effectively reduces the task drop rate and the system's delay and energy consumption under different delay constraints and task generation rates.
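The asynchronous reward idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the record gives no equations, so the class names `PendingTask` and `AsyncRewardBuffer`, the action layout, the reward formula (negative delay plus energy), the drop penalty, and the choice of the settlement-time observation as the next state are all assumptions made for illustration. The sketch only shows the bookkeeping: an action is taken in the slot where the task is generated, and the transition enters the replay buffer later, once the task has finished or missed its deadline.

```python
import random
from collections import deque
from dataclasses import dataclass


@dataclass
class PendingTask:
    state: tuple      # observation in the slot where the task was generated
    action: tuple     # hypothetical layout: (offload flag, bandwidth share, CPU share)
    start_slot: int   # slot in which the task was generated and scheduled
    finish_slot: int  # slot in which execution would complete
    energy: float     # energy spent on the task (hypothetical units)


class AsyncRewardBuffer:
    """Holds transitions until their delayed reward is known (illustrative only)."""

    def __init__(self, deadline_slots, drop_penalty=-10.0, capacity=10_000):
        self.deadline_slots = deadline_slots  # delay constraint, in slots
        self.drop_penalty = drop_penalty      # assumed reward for a dropped task
        self.pending = []                     # tasks still running
        self.replay = deque(maxlen=capacity)  # settled (s, a, r, s') tuples

    def submit(self, task):
        # Resources are allocated immediately; the reward is not known yet.
        self.pending.append(task)

    def settle(self, current_slot, current_state):
        """Move tasks that finished or expired by current_slot into replay."""
        still_pending = []
        for task in self.pending:
            deadline = task.start_slot + self.deadline_slots
            if task.finish_slot <= current_slot and task.finish_slot <= deadline:
                # Completed in time: assumed reward is negative (delay + energy).
                delay = task.finish_slot - task.start_slot
                reward = -(delay + task.energy)
            elif current_slot >= deadline:
                # Deadline reached without completion: the task is dropped.
                reward = self.drop_penalty
            else:
                still_pending.append(task)
                continue
            self.replay.append((task.state, task.action, reward, current_state))
        self.pending = still_pending

    def sample(self, batch_size):
        # Uniform minibatch for a DDPG-style critic and actor update.
        return random.sample(list(self.replay), min(batch_size, len(self.replay)))
```

In a training loop under these assumptions, `submit` would be called whenever a task is generated and `settle` once per time slot, so several tasks can be in flight at the same time while the agent keeps acting.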

Keywords: edge computing; task offloading; joint resource optimization; dynamic bandwidth allocation; DDPG

Classification code: TN915.07 [Electronics and Telecommunications: Communication and Information Systems]

 
