Authors: Xue Naiyang (薛乃阳), Ding Dan (丁丹), Jia Yutong (贾玉童), Wang Zhiqiang (王志强), Liu Yuan (刘渊)
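Affiliations: [1] Graduate School, Space Engineering University, Beijing 101416, China; [2] Department of Electronic and Optical Engineering, Space Engineering University, Beijing 101416, China; [3] PLA 61646 Troops, Beijing 100192, China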
Source: Journal of System Simulation (《系统仿真学报》), 2023, No. 2, pp. 423-434 (12 pages)
Abstract: Taking the joint scheduling of heterogeneous TT&C (telemetry, tracking and command) network resources as the research object, this paper proposes a deep Q network (DQN) algorithm based on reinforcement learning. After the characteristics of the joint scheduling problem for heterogeneous TT&C resources are analyzed, the constraints that affect its solution are described mathematically and a joint resource scheduling model is established. To apply reinforcement learning, the problem is first described as a Markov decision process; two neural networks with the same structure and an action selection strategy based on the ε-greedy algorithm are then designed, and a DQN solution framework is built. Simulation results show that the DQN-based scheduling method for heterogeneous TT&C resources finds TT&C scheduling schemes with better scheduling revenue than a genetic algorithm.
Keywords: space TT&C; joint scheduling of heterogeneous TT&C resources; deep Q network; scheduling revenue; reinforcement learning
Classification number: TP273.1 [Automation and Computer Technology: Detection Technology and Automation Devices]
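The abstract names three DQN ingredients: two structurally identical neural networks, ε-greedy action selection, and a Markov decision process formulation. The sketch below illustrates how these pieces usually fit together in a generic DQN. It is not the authors' implementation: the abstract gives no state, action, or reward encoding for the TT&C scheduling problem, so the two networks are assumed to be the standard online/target pair, and every function and parameter name (`epsilon_greedy`, `td_target`, `sync_target`, `gamma`) is hypothetical.

```python
import copy
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Explore with probability epsilon, otherwise take the greedy action.

    q_values: list of Q-value estimates, one per candidate action
    (assumed here to come from the online network).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def td_target(reward, next_q_values_target, gamma, done):
    """One-step DQN learning target.

    next_q_values_target: Q-values of the next state as estimated by the
    second (target) network, which has the same structure as the online one.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values_target)


def sync_target(online_params, target_params):
    """Overwrite the target network's parameters with a copy of the online ones.

    Parameters are modeled as plain dicts purely for illustration.
    """
    target_params.clear()
    target_params.update(copy.deepcopy(online_params))


if __name__ == "__main__":
    q = [0.1, 0.9, 0.3]  # made-up Q-values for three candidate actions
    print(epsilon_greedy(q, epsilon=0.1))
    print(td_target(reward=1.0, next_q_values_target=[0.5, 2.0], gamma=0.9, done=False))
```

With `epsilon=0.0` the selection is purely greedy, while a larger `epsilon` trades scheduling revenue in the short term for exploration of alternative actions. Computing the target from a separate, periodically synchronized network is the usual reason a DQN keeps two networks of identical structure.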