Vertical handover policy for cyber-physical systems aided by SAGIN based on deep reinforcement learning

Authors: WU Yan[1]; PAN Guangchuan; YAO Mingwu[1]; YANG Qinghai[1]; LEUNG Victor C.M.

Affiliations: [1] State Key Laboratory of Integrated Services Network, School of Telecommunications, Xidian University, Xi'an 710071, China; [2] Internet of Things Research Center, School of Computer and Software, Shenzhen University, Shenzhen 518060, China

Source: Journal on Communications, 2024, No. 8, pp. 180-191 (12 pages)

Funding: National Key Research and Development Program of China (No. 2020YFB1807700); Shaanxi Province Innovation Team Fund (No. 2024RS-CXTD-01).

Abstract: This paper studies a deep reinforcement learning based vertical handover policy for space-air-ground integrated network (SAGIN) cyber-physical systems, whose models are complex and for which prior knowledge of the network topology and modeling assumptions are hard to obtain. First, jointly considering system stability, handover cost, and network usage cost constraints, the vertical handover problem is modeled as a constrained Markov decision process (CMDP), and a sufficient condition guaranteeing the existence of a feasible solution is derived. Second, a constraint-proximal policy optimization (CPPO) algorithm is proposed to solve the CMDP, and a distributed reinforcement learning scheme is introduced at the base stations to accelerate training convergence. Simulation results verify the effectiveness and superiority of the proposed vertical handover policy over the baseline policies.
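For orientation, the textbook form of a CMDP and of the clipped PPO surrogate can be sketched as follows. This is a generic illustration, not the paper's formulation: the abstract does not specify the reward $r$, the cost functions $c_i$, the thresholds $d_i$, the discount factor $\gamma$, or which of stability, handover cost, and network usage cost is treated as the objective and which as constraints, so every symbol below is an assumption. The Lagrangian relaxation shown is one common way to handle the constraints and may differ from how CPPO does it.

```latex
% Generic discounted CMDP (all symbols are illustrative assumptions)
\begin{align}
  \max_{\pi}\quad & J_R(\pi) = \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\right] \\
  \text{s.t.}\quad & J_{C_i}(\pi) = \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, c_i(s_t, a_t)\right] \le d_i,
  \qquad i = 1, \dots, K .
\end{align}

% Lagrangian relaxation with multipliers \lambda_i \ge 0
\begin{equation}
  \min_{\lambda \ge 0}\; \max_{\pi}\;
  \mathcal{L}(\pi, \lambda) = J_R(\pi) - \sum_{i=1}^{K} \lambda_i \bigl( J_{C_i}(\pi) - d_i \bigr).
\end{equation}

% Standard PPO clipped surrogate for a policy \pi_\theta with advantage estimate \hat{A}_t
\begin{equation}
  L^{\mathrm{CLIP}}(\theta) = \mathbb{E}_t\!\left[
    \min\!\Bigl( \rho_t(\theta)\, \hat{A}_t,\;
    \operatorname{clip}\bigl(\rho_t(\theta),\, 1-\epsilon,\, 1+\epsilon\bigr)\, \hat{A}_t \Bigr)
  \right],
  \qquad
  \rho_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)} .
\end{equation}
```

In a Lagrangian-style constrained variant, the advantage $\hat{A}_t$ would typically be computed from the penalized reward $r - \sum_i \lambda_i c_i$, with each $\lambda_i$ raised when its constraint is violated and lowered otherwise.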

Keywords: space-air-ground integrated network; cyber-physical system; deep reinforcement learning; vertical handover

Classification: TN929.5 [Electronics and Telecommunications: Communication and Information Systems]
