Affiliation: [1] School of Computer, Electronics and Information, Guangxi University, Nanning 530004, China
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2025, No. 4, pp. 863-875 (13 pages)
Funding: Supported by the National Natural Science Foundation of China (62062007, 62272198).
Abstract: In cloud-edge collaborative computing, the offloading decision for computational tasks is a current research hotspot. Existing solutions typically adopt single-agent reinforcement learning to solve this problem, which suffers from low robustness and an overly large decision space, and they fail to consider user mobility, delayed rewards, and the problems of information observation and synchronization. To address these deficiencies, this paper proposes a cloud-edge collaborative network model that accounts for the limited local observation capability of devices, together with task computing-queue and transmission-queue models, and designs a distributed offloading scheme based on "task-oriented" multi-agent reinforcement learning. First, the scheme presents an information synchronization protocol that enables devices to acquire the global network state, and designs task offloading scheduling rules that specify the computation and scheduling procedures of servers under scenarios such as users moving across regions and line failures. Then, the scheme constructs a multi-agent system based on the Actor-Critic framework with edge servers as agents, gives a cooperation method among the agents, and considers the independent operation of agents in the event of line failures. Next, to address the delayed-reward problem, the offloading decision problem is modeled as a "task-oriented" Markov decision process that abandons the commonly used equidistant time-slot model in favor of dynamic, parallel time slots whose step size is the task processing time. Finally, with this process as the mathematical foundation, the paper proposes a task offloading decision algorithm, TOMAC-A2C. Drawing on multi-agent reinforcement learning, the algorithm lets agents cooperatively complete the offloading work and evaluate one another to update their neural network parameters, and it incorporates long short-term memory networks to memorize and predict user mobility. Experimental results on a real-world dataset of Android device mobility show that the proposed distributed offloading scheme effectively reduces service latency, energy consumption, and the task drop rate under both high load and high line-failure rates.
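As an illustration of the kind of agent the abstract describes, the following is a minimal sketch of one edge-server agent built on the Actor-Critic framework with an LSTM-based actor that retains a memory of recent observations, written with PyTorch. All names and dimensions here (EdgeAgent, OBS_DIM, N_ACTIONS, a2c_update) are assumptions for illustration only; the abstract does not specify the actual network structure, observation encoding, or hyperparameters of TOMAC-A2C.

# Minimal sketch of one edge-server agent in an Actor-Critic setup with an
# LSTM-based actor, assuming PyTorch. OBS_DIM, N_ACTIONS and the update rule
# are hypothetical; the paper's real TOMAC-A2C design is not given in the abstract.
import torch
import torch.nn as nn
import torch.nn.functional as F

OBS_DIM = 16      # assumed size of one local observation (queue lengths, channel state, user position)
HIDDEN = 64       # assumed LSTM hidden size
N_ACTIONS = 4     # assumed discrete offloading choices (local, this server, neighbor server, cloud)

class EdgeAgent(nn.Module):
    """One edge server acting as an agent: LSTM actor plus feed-forward critic."""
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(OBS_DIM, HIDDEN, batch_first=True)  # remembers the user mobility history
        self.actor = nn.Linear(HIDDEN, N_ACTIONS)                # offloading policy head
        self.critic = nn.Linear(HIDDEN, 1)                       # state-value head

    def forward(self, obs_seq):
        # obs_seq: (batch, time, OBS_DIM) sequence of observations associated with one task
        out, _ = self.lstm(obs_seq)
        h = out[:, -1, :]                                        # last hidden state summarizes the history
        return F.softmax(self.actor(h), dim=-1), self.critic(h).squeeze(-1)

def a2c_update(agent, optimizer, obs_seq, action, reward, next_value, gamma=0.99):
    """One advantage actor-critic step for a single completed task ("task-oriented" step)."""
    probs, value = agent(obs_seq)
    advantage = reward + gamma * next_value - value              # TD-style advantage estimate
    actor_loss = -torch.log(probs.gather(1, action.view(-1, 1)).squeeze(1)) * advantage.detach()
    critic_loss = advantage.pow(2)
    loss = (actor_loss + 0.5 * critic_loss).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Usage: one agent, one fake observation sequence of 5 past steps
agent = EdgeAgent()
opt = torch.optim.Adam(agent.parameters(), lr=1e-3)
obs = torch.randn(1, 5, OBS_DIM)
action = torch.tensor([2])
a2c_update(agent, opt, obs, action, reward=torch.tensor([1.0]), next_value=torch.tensor([0.0]))

In the multi-agent setting described above, each edge server would run such an agent, and the update would use evaluations exchanged among cooperating agents rather than a purely local critic; this sketch only shows the single-agent A2C step with an LSTM memory of recent observations.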
Keywords: mobile edge computing; task offloading; deep reinforcement learning; multi-agent; task-oriented
CLC number: TP393 [Automation and Computer Technology - Computer Application Technology]