Authors: HAN Yibo (韩易博); HE Ruizhi (何睿智)[1]; TANG Guojian (汤国建)[1]; BAO Weimin (包为民)[1,2]
Affiliations: [1] College of Aerospace Science and Engineering, National University of Defense Technology, Changsha 410073, China; [2] China Aerospace Science and Technology Corporation, Beijing 100048, China
Source: Journal of Astronautics (《宇航学报》), 2025, No. 2, pp. 366-377 (12 pages)
Funding: National Natural Science Foundation of China (92371203).
Abstract: A reinforcement learning based computational guidance method for launch vehicles is proposed to address mission failure caused by thrust drop faults during the non-orbiting flight phase. By integrating the rolling horizon optimization idea with reinforcement learning, the guidance problem for the non-orbiting flight phase is reformulated as a Markov decision process. In each guidance cycle, the program angle command is adjusted to achieve trajectory reconfiguration, and a neural network fits the underlying mapping so that dynamic decisions can be made efficiently. During the offline training phase, real-time interaction between the agent and the environment simulates the trajectory reconfiguration process under thrust drop faults, and the agent iteratively refines its policy. In the online application phase, the policy network generates program angle adjustments from the current state, enabling autonomous decision-making for the flight timing sequence without human intervention or an accurate model. Simulation results show that the proposed method balances solution accuracy and computational efficiency, is robust, and is applicable to guidance and online trajectory reconfiguration during the non-orbiting flight phase.
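Keywords: launch vehicle; thrust drop fault; computational guidance; reinforcement learning; trajectory reconfiguration
Classification: V448.1 [Aeronautical and Astronautical Science and Technology: Flight Vehicle Design]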