An active inference based deep reinforcement learning algorithm for edge low-altitude systems

Authors: DU Jianbo; GONG Jie; WANG Jiaxuan; WANG Yuting; CHEN Tianci; LI Shulei

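Affiliations: [1] School of Communications and Information Engineering, Xi’an University of Posts and Telecommunications, Xi’an 710121, Shaanxi, China [2] Shaanxi Key Laboratory of Information Communication Network and Security, Xi’an 710121, Shaanxi, China [3] School of Telecommunications Engineering, Xidian University, Xi’an 710071, Shaanxi, China [4] Tianyuan Ruixin Communication Technology Co., Ltd., Xi’an 710119, Shaanxi, China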

Source: Journal of Xi’an University of Posts and Telecommunications, 2025, No. 1, pp. 9-18 (10 pages)

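Funding: National Natural Science Foundation of China (62271391, 62471388, 62371392); Open Research Fund of the Guangdong Laboratory of Artificial Intelligence and Digital Economy (Shenzhen) (GML-KF-24-34); Special Scientific Research Project for Serving Local Needs of the Shaanxi Provincial Department of Education (21JC032); Shaanxi Province International Science and Technology Cooperation Special Project (2023-GHZD-37); Shaanxi Province Key Industry Chain Project (2023ZDLGY-49, 2024GX-ZDCYL-05-01); Shaanxi Qinchuangyuan "Scientist + Engineer" Team Building Project (2024QCY-KXG-156).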

Abstract: To reduce system overhead and improve users' quality of experience (QoE) in edge low-altitude systems, an active inference based deep reinforcement learning (ADRL) algorithm is proposed. A network model of an unmanned aerial vehicle (UAV)-assisted multi-access edge computing (MEC) system is constructed, in which edge servers are deployed on UAVs that provide users with computation offloading and content caching services. Taking the limited computing and networking resources of the UAVs into account, an optimization problem is formulated with the goals of minimizing user cost and maximizing user QoE. The problem is then transformed into a Markov decision process (MDP) to carry out task offloading, content caching, and resource allocation. The proposed algorithm is compared with a scheme without object caching and with a scheme that allocates UAV bandwidth equally, in terms of QoE, system overhead, and real-time reward. Simulation results show that, relative to the benchmark algorithm, the proposed algorithm reduces user cost by approximately 13% and improves user QoE by approximately 14%.

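Keywords: unmanned aerial vehicle; multi-access edge computing; deep reinforcement learning; Markov decision process; content caching; task offloading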

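Classification: TN929.5 [Electronics and Telecommunications: Communication and Information Systems]; TP18 [Electronics and Telecommunications: Information and Communication Engineering]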

 
