Resource Scheduling Method for Integration of TT&C and Observation Based on Multi-Agent Deep Reinforcement Learning

Cited by: 3

Authors: CHENG Siyue (成思玥); LI Haoran (李浩然); BAI Weigang (白卫岗); ZHOU Di (周笛); ZHU Yan (朱彦) (School of Telecommunications Engineering, Xidian University, Xi'an 710071, China)

Affiliation: [1] School of Telecommunications Engineering, Xidian University, Xi'an 710071, Shaanxi, China

Source: Space-Integrated-Ground Information Networks (《天地一体化信息网络》), 2023, No. 1, pp. 12-22 (11 pages)

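Funding: National Key Research and Development Program of China (No. 2020YFB1806100); National Natural Science Foundation of China, Youth Program (No. 62101410); Qin Chuangyuan Program for Recruiting High-Level Innovation and Entrepreneurship Talents (No. QCYRCXM-2022-228).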

Abstract: With the development of satellite communication technology and the continuing growth of constellation scale, the integration of TT&C and observation has become the mainstream trend. Large constellations, numerous scheduling objects, and the joint control of complex operations pose major challenges for integrated TT&C and observation resource scheduling in satellite networks. Because scheduling algorithms suffer from low solution efficiency and complex constraints, traditional TT&C resource scheduling uploads TT&C commands in advance and executes tasks according to a fixed deployment, which makes it difficult to meet the scheduling demands of unexpected events and urgent tasks. This paper therefore proposes an integrated TT&C and observation resource scheduling method based on the multi-agent actor-critic deterministic policy gradient algorithm (MADDPG). Using centralized training and distributed execution, the method builds a multi-agent model of integrated TT&C and observation tasks and computes the scheduling policy by analyzing the local information of neighboring agents, which improves task response speed. Based on the models and constraints of the integrated resource scheduling problem, constraints that are significant and interpretable are selected, a multi-agent reinforcement learning model for resource scheduling is established, and simulation tests are carried out. The results show that the task benefit of the proposed method is 22% higher than that of the traditional method.
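The centralized-training, distributed-execution structure mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class names `Actor` and `CentralCritic`, the linear stand-ins for neural networks, the observation size, and the reading of an action as a resource-allocation fraction in [0, 1] are all assumptions made for illustration, and no training update is shown.

```python
import random


class Actor:
    """Hypothetical decentralized policy: one agent's local observation -> action."""

    def __init__(self, obs_dim, rng):
        # Linear weights stand in for a policy network (assumption).
        self.weights = [rng.uniform(-1.0, 1.0) for _ in range(obs_dim)]

    def act(self, local_obs):
        # Deterministic policy: linear score clipped to [0, 1].
        score = sum(w * o for w, o in zip(self.weights, local_obs))
        return max(0.0, min(1.0, 0.5 + 0.5 * score))


class CentralCritic:
    """Hypothetical centralized critic: sees every agent's observation and action."""

    def __init__(self, n_agents, obs_dim, rng):
        joint_dim = n_agents * obs_dim + n_agents
        self.weights = [rng.uniform(-1.0, 1.0) for _ in range(joint_dim)]

    def q_value(self, all_obs, all_actions):
        # Concatenate the joint observation and joint action, then score linearly.
        joint = [x for obs in all_obs for x in obs] + list(all_actions)
        return sum(w * x for w, x in zip(self.weights, joint))


def execute(actors, all_obs):
    """Distributed execution: each actor uses only its own observation."""
    return [actor.act(obs) for actor, obs in zip(actors, all_obs)]


rng = random.Random(0)
N_AGENTS, OBS_DIM = 3, 4  # illustrative sizes, not taken from the paper
actors = [Actor(OBS_DIM, rng) for _ in range(N_AGENTS)]
critic = CentralCritic(N_AGENTS, OBS_DIM, rng)

observations = [[rng.random() for _ in range(OBS_DIM)] for _ in range(N_AGENTS)]
actions = execute(actors, observations)

# During training only, the critic evaluates the joint behavior of all agents.
q = critic.q_value(observations, actions)
```

In this sketch the critic is needed only while training; at execution time each agent calls `act` on its own observation, which is what allows a fast local response without a central scheduler in the loop.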

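Keywords: integration of TT&C and observation; large-scale constellation system; resource scheduling; multi-agent deep reinforcement learning; task benefit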

Classification No.: V19 [Aerospace Science and Technology - Man-Machine and Environmental Engineering]

 
