Optimization of single-user task migration based on improved DDPG

Authors: HU Can (胡灿), ZHU Zheng-wei (朱正伟) [1], ZHU Chen-yang (朱晨阳) [2], ZHU Yan-ping (诸燕平) [1]

Affiliations: [1] School of Microelectronics and Control Engineering, Changzhou University, Changzhou 213164, Jiangsu, China; [2] School of Computer Science and Artificial Intelligence, Changzhou University, Changzhou 213164, Jiangsu, China

Source: Computer Engineering and Design (《计算机工程与设计》), 2023, No. 11, pp. 3352-3359 (8 pages)

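Funding: National Natural Science Foundation of China (61801055); Changzhou Key Research and Development Program (CJ20210123); Jiangsu Provincial Postgraduate Research Innovation Fund (KYCX22_3053, KYCX22_3060); General Program of Natural Science Research of Jiangsu Higher Education Institutions (22KJB520012).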

Abstract: Traditional reinforcement learning algorithms converge slowly and unstably when minimizing computation cost on edge servers with random task arrivals and time-varying wireless channels. To address these problems, an improved DDPG algorithm (IDDPG) is proposed. First, the Critic network of DDPG is replaced with a Dueling structure, which splits the action-value function into a state-value function and an advantage function so that the Critic converges faster. Second, the Critic network is updated more frequently than the Actor network, which makes overall training more stable. Third, Ornstein-Uhlenbeck noise is added to the actions selected by the Actor network to improve exploration, and the noise magnitude is set in segments (piecewise over the course of training) to keep convergence stable. Experimental results show that, compared with the other algorithms tested, IDDPG achieves a lower computation cost and improves both convergence speed and convergence stability.
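The three modifications named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: every name (`OUNoise`, `noise_scale`, `noisy_action`, `dueling_q`, `update_plan`), every hyperparameter value (`theta`, `sigma`, the segment boundaries, `actor_every=2`), the action range `[0, 1]`, and the additive dueling form `Q(s, a) = V(s) + A(s, a)` are assumptions chosen for illustration, since the abstract gives none of these details.

```python
import random


class OUNoise:
    """Ornstein-Uhlenbeck process: x += theta*(mu - x)*dt + sigma*sqrt(dt)*N(0, 1)."""

    def __init__(self, mu=0.0, theta=0.15, sigma=0.2, dt=1.0, seed=None):
        # All defaults are illustrative assumptions, not values from the paper.
        self.mu, self.theta, self.sigma, self.dt = mu, theta, sigma, dt
        self.rng = random.Random(seed)
        self.x = mu

    def reset(self):
        """Restart the process at its long-run mean (e.g. at episode start)."""
        self.x = self.mu

    def sample(self):
        """Advance the process one step and return the new noise value."""
        drift = self.theta * (self.mu - self.x) * self.dt
        diffusion = self.sigma * (self.dt ** 0.5) * self.rng.gauss(0.0, 1.0)
        self.x += drift + diffusion
        return self.x


def noise_scale(episode, schedule=((0, 1.0), (100, 0.5), (300, 0.1))):
    """Piecewise-constant scale: value of the last segment with start <= episode.

    The segment boundaries and scales are hypothetical.
    """
    scale = schedule[0][1]
    for start, value in schedule:
        if episode >= start:
            scale = value
    return scale


def noisy_action(actor_output, noise, episode, low=0.0, high=1.0):
    """Add scaled OU noise to the actor's action, then clip to [low, high]."""
    action = actor_output + noise_scale(episode) * noise.sample()
    return max(low, min(high, action))


def dueling_q(state_value, advantage):
    """Assumed dueling critic head for a continuous action: Q = V(s) + A(s, a)."""
    return state_value + advantage


def update_plan(total_steps, actor_every=2):
    """Count updates when the critic trains every step and the actor
    only every `actor_every` steps (critic frequency > actor frequency)."""
    critic_updates = 0
    actor_updates = 0
    for step in range(1, total_steps + 1):
        critic_updates += 1
        if step % actor_every == 0:
            actor_updates += 1
    return critic_updates, actor_updates
```

In a full training loop, `noisy_action` would wrap the Actor network's output at each environment step, `dueling_q` would stand in for the final layer of the Critic (with `state_value` and `advantage` produced by two separate network streams), and the `step % actor_every` test from `update_plan` would gate the Actor's gradient step.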

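Keywords: deep reinforcement learning; edge computing; task offloading; policy optimization; network structure; algorithm optimization; exploration noise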

Classification: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]
