Authors: Xiaoyi Zhou, Liang Huang, Tong Ye, Weiqiang Sun
Affiliations: [1] State Key Laboratory of Advanced Optical Communication Systems and Networks, Shanghai Jiao Tong University, Shanghai 200240, China; [2] College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310058, China
Source: Digital Communications and Networks (Chinese title: 数字通信与网络(英文版)), 2024, Issue 6, pp. 1769-1781 (13 pages)
Funding: Supported in part by the National Natural Science Foundation of China under Grant 62271306, Grant 62072410, and Grant 62331017; in part by the Fundamental Research Funds for the Provincial Universities of Zhejiang under Grant RF-B2022002.
Abstract: This paper investigates a multi-Unmanned Aerial Vehicle (UAV)-assisted wireless-powered Mobile Edge Computing (MEC) system, in which UAVs provide computation and powering services to mobile terminals. We aim to maximize the number of completed computation tasks by jointly optimizing the offloading decisions of all terminals and the trajectory planning of all UAVs. The action space of the system is extremely large and grows exponentially with the number of UAVs. In this case, single-agent learning requires an excessively large neural network, which results in insufficient exploration. However, offloading decisions and trajectory planning are two subproblems handled by different entities, which provides an opportunity to decompose the problem. We therefore propose a 2-Tiered Multi-agent Soft Actor-Critic (2T-MSAC) algorithm that decomposes a single neural network into multiple small-scale networks. In the first tier, a single agent makes the offloading decisions, and an online pretrained model based on imitation learning is specially designed to accelerate the training of this agent. In the second tier, the UAVs use multiple agents to plan their trajectories. Each agent influences the parameter updates of the other agents through its actions and rewards, thereby achieving joint optimization. Simulation results demonstrate that the proposed algorithm can be applied to scenarios with various location distributions of terminals, outperforming existing benchmarks that perform well only in specific scenarios. In particular, 2T-MSAC increases the number of completed tasks by 45.5% in the scenario with uneven terminal distributions. Moreover, the pretrained model based on imitation learning reduces the convergence time of 2T-MSAC by 58.2%.
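The abstract's point about exponential growth can be illustrated with a small counting sketch. It is not taken from the paper: it assumes, purely for illustration, that each UAV picks one of a fixed number of discretized moves per step, and the function names and the value 9 are hypothetical. Under that assumption, a single agent controlling all UAVs must rank every combination of moves, while one agent per UAV only ranks its own moves.

```python
# Hypothetical illustration; the names and numbers are assumptions, not from the paper.

def joint_action_count(num_uavs: int, moves_per_uav: int) -> int:
    """Number of joint actions a single agent controlling all UAVs must choose among."""
    return moves_per_uav ** num_uavs


def decomposed_output_count(num_uavs: int, moves_per_uav: int) -> int:
    """Total number of per-agent choices when each UAV has its own agent."""
    return num_uavs * moves_per_uav


if __name__ == "__main__":
    # Assumed: 9 discretized moves per UAV (for example, 8 headings plus hovering).
    for n in (1, 2, 4, 8):
        print(n, joint_action_count(n, 9), decomposed_output_count(n, 9))
```

The first count grows exponentially in the number of UAVs and the second only linearly, which is the motivation the abstract gives for splitting one large network into several small ones.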
Keywords: Mobile-edge computing; Multi-agent reinforcement learning; Offloading decision; Trajectory planning; Unmanned aerial vehicle; Wireless power transfer
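Classification codes: TN929.5 [Electronics and Telecommunications: Communication and Information Systems]; TP18 [Electronics and Telecommunications: Information and Communication Engineering]; V19 [Automation and Computer Technology: Control Theory and Control Engineering]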