Authors: LU Xianling (卢先领); LI Dekang (李德康) (School of Internet of Things, Jiangnan University, Wuxi 214122, China)
Source: Journal of Electronics & Information Technology (《电子与信息学报》), 2025, No. 1, pp. 116-127 (12 pages)
Funding: National Natural Science Foundation of China (61773181).
Abstract: When applied to task offloading in large-scale Multi-access Edge Computing (MEC) systems, algorithms based on single-agent reinforcement learning suffer from mutual interference among agents and degradation of the offloading policies. In traditional multi-agent algorithms, represented by the Multi-Agent Deep Deterministic Policy Gradient (MADDPG), the dimensionality of the joint action space grows proportionally with the number of agents in the system, which degrades scalability. To address these problems, this paper formulates the large-scale MEC task offloading problem as a Partially Observable Markov Decision Process (POMDP) and proposes a mean-field multi-agent task offloading algorithm. A Long Short-Term Memory (LSTM) network is introduced to handle the partial observability problem, and mean-field approximation theory is introduced to reduce the dimensionality of the joint action space. Simulation results show that the proposed algorithm outperforms single-agent offloading algorithms in task delay and task drop rate, and that it matches MADDPG on both metrics while reducing the dimensionality of the joint action space.

Objective: Recently, task offloading techniques based on reinforcement learning in Multi-access Edge Computing (MEC) have attracted considerable attention and are increasingly being utilized in industrial applications. Algorithms for task offloading that rely on single-agent reinforcement learning are typically developed within a decentralized framework, which is preferred due to its relatively low computational complexity. However, in large-scale MEC environments, such task offloading policies are formed solely based on local observations, often resulting in partial observability challenges. Consequently, this can lead to interference among agents and a degradation of the offloading policies. In contrast, traditional multi-agent reinforcement learning algorithms, such as the Multi-Agent Deep Deterministic Policy Gradient (MADDPG), consolidate the observation and action vectors of all agents, thereby effectively addressing the partial observability issue. Optimal joint offloading policies are subsequently derived through online training. Nonetheless, the centralized training and decentralized execution model inherent in MADDPG causes computational complexity to increase linearly with the number of mobile devices (MDs). This scalability issue restricts the ability of MEC systems to accommodate additional devices, ultimately undermining the system's overall scalability.

Methods: First, a task offloading queue model for large-scale MEC systems is developed to handle delay-sensitive tasks with deadlines. This model incorporates both the transmission process, where tasks are offloaded via wireless channels to the edge server, and the computation process, where tasks are processed on the edge server. Second, the offloading process is defined as a Partially Observable Markov Decision Process (POMDP) with specified observation space, action space, and reward function for the agents. The Mean-Field Multi-Agent Task Offloading (MF-MATO) algorithm is subsequently proposed. Long Short-Term Memory (LSTM) networks are utilized to predict the current state vector from each agent's history of local observations, mitigating the partial observability problem, and mean-field approximation theory is applied to reduce the dimensionality of the joint action space.
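The core mean-field idea described above can be sketched as follows. This is an illustrative sketch only, not the paper's implementation: the function name, the one-hot encoding of discrete offloading actions, and the example action values are all assumptions made for the demonstration.

```python
import numpy as np

def mean_field_action(neighbor_actions, num_actions):
    """Approximate the joint action of neighboring agents by their mean action.

    neighbor_actions: list of discrete action indices chosen by the neighbors
                      (e.g., which edge server a task is offloaded to).
    num_actions:      size of each agent's discrete action space.

    Instead of conditioning a critic on the full joint action, whose dimension
    grows with the number of agents, mean-field MARL conditions it on this
    fixed-size empirical distribution over actions.
    """
    one_hot = np.eye(num_actions)[neighbor_actions]  # shape (N, num_actions)
    return one_hot.mean(axis=0)                      # shape (num_actions,)

# Four hypothetical neighbors choosing among 3 offloading actions:
mean_a = mean_field_action([0, 2, 2, 1], num_actions=3)
# A critic Q(s, a_i, mean_a) now has an input size that is independent
# of how many agents the MEC system contains.
```

The fixed-size mean action is what keeps the critic's input dimension constant as mobile devices are added, which is the scalability property the abstract attributes to the mean-field approximation.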
Keywords: multi-access edge computing; task offloading; reinforcement learning; multi-agent algorithms; mean-field approximation theory
Classification: TN929.5 [Electronics and Telecommunications: Communication and Information Systems]