Reinforcement Learning Applied to Computational Guidance of Launch Vehicle under Thrust Drop Faults

Authors: HAN Yibo, HE Ruizhi [1], TANG Guojian [1], BAO Weimin [1,2]

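Affiliations: [1] College of Aerospace Science and Engineering, National University of Defense Technology, Changsha 410073, China; [2] China Aerospace Science and Technology Corporation, Beijing 100048, China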

Source: Journal of Astronautics (《宇航学报》), 2025, No. 2, pp. 366-377 (12 pages)

Funding: National Natural Science Foundation of China (92371203).

Abstract: To address mission failure caused by thrust drop faults during the non-orbiting flight phase of a launch vehicle, a computational guidance method based on reinforcement learning is proposed. Drawing on the rolling horizon optimization idea within reinforcement learning, the guidance problem for the non-orbiting flight phase is reformulated as a Markov decision process: in each guidance cycle the program angle command is adjusted to achieve trajectory reconfiguration, and a neural network fits the underlying mapping so that dynamic decisions can be made efficiently. In the offline training phase, the agent interacts with the environment to simulate the trajectory reconfiguration process under thrust drop faults and iteratively refines its policy. In the online application phase, the policy network generates program angle adjustments from the current state, enabling autonomous decision-making for the flight sequence without human intervention or accurate model information. Simulation results show that the proposed method balances solution accuracy and computational efficiency, is robust, and is suitable for guidance and online trajectory reconfiguration during the non-orbiting flight phase.
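The per-cycle decision loop described in the abstract can be illustrated with a minimal sketch. The abstract does not give the dynamics, state vector, reward, or network architecture, so everything below is an assumption made for illustration: a planar point-mass ascent model, the constants, the `ToyAscentEnv` and `TinyPolicy` names, and the terminal reward are all hypothetical, and the policy is an untrained random network (the offline training step is omitted). It is not the authors' implementation.

```python
import math
import random

G0 = 9.80665                   # standard gravity, m/s^2
NOMINAL_THRUST = 2.0e6         # N (hypothetical)
ISP = 300.0                    # s (hypothetical)
MAX_DELTA = math.radians(1.0)  # max program-angle change per guidance cycle (hypothetical)
TARGET_ALT = 5.0e4             # m, hypothetical terminal altitude target
TARGET_VX = 1.0e3              # m/s, hypothetical terminal horizontal speed target


class ToyAscentEnv:
    """Planar point-mass ascent with a thrust drop. A toy model, not the paper's."""

    def __init__(self, thrust_scale=0.8, fault_time=20.0, dt=1.0, n_steps=60):
        self.thrust_scale = thrust_scale  # fraction of nominal thrust after the fault
        self.fault_time = fault_time
        self.dt = dt
        self.n_steps = n_steps
        self.reset()

    def reset(self):
        self.k = 0
        self.x = self.y = self.vx = self.vy = 0.0
        self.mass = 1.0e5
        self.pitch = math.radians(90.0)  # program angle, measured from the horizontal
        return self.observe()

    def observe(self):
        # Roughly normalized state handed to the policy.
        return [self.y / TARGET_ALT, self.vx / TARGET_VX, self.vy / 1.0e3,
                self.pitch, self.k / self.n_steps]

    def step(self, delta_pitch):
        # Action: bounded adjustment of the program angle for this guidance cycle.
        self.pitch += max(-MAX_DELTA, min(MAX_DELTA, delta_pitch))
        faulted = self.k * self.dt >= self.fault_time
        thrust = NOMINAL_THRUST * (self.thrust_scale if faulted else 1.0)
        ax = thrust * math.cos(self.pitch) / self.mass
        ay = thrust * math.sin(self.pitch) / self.mass - G0
        # Semi-implicit Euler: update velocity first, then position.
        self.vx += ax * self.dt
        self.vy += ay * self.dt
        self.x += self.vx * self.dt
        self.y += self.vy * self.dt
        self.mass -= thrust / (ISP * G0) * self.dt
        self.k += 1
        done = self.k >= self.n_steps
        # Terminal-only reward: negative normalized miss of the hypothetical targets.
        reward = 0.0
        if done:
            reward = -(abs(self.y - TARGET_ALT) / TARGET_ALT
                       + abs(self.vx - TARGET_VX) / TARGET_VX)
        return self.observe(), reward, done


class TinyPolicy:
    """One-hidden-layer tanh network mapping state -> program-angle adjustment."""

    def __init__(self, n_in=5, n_hidden=8, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.gauss(0.0, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.gauss(0.0, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def act(self, obs):
        hidden = [math.tanh(sum(w * o for w, o in zip(row, obs)) + b)
                  for row, b in zip(self.w1, self.b1)]
        out = math.tanh(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)
        return MAX_DELTA * out  # tanh keeps the command inside the bound


def rollout(env, policy):
    """Run one episode: one policy evaluation per guidance cycle."""
    obs, done = env.reset(), False
    actions, total_reward = [], 0.0
    while not done:
        action = policy.act(obs)
        obs, reward, done = env.step(action)
        actions.append(action)
        total_reward += reward
    return actions, total_reward


if __name__ == "__main__":
    actions, ret = rollout(ToyAscentEnv(thrust_scale=0.7), TinyPolicy(seed=1))
    print(len(actions), ret)
```

In this sketch the online phase costs only one small network evaluation per guidance cycle, which is the property the abstract attributes to the learned mapping. A training algorithm would update the weights of `TinyPolicy` from many such rollouts with randomized `thrust_scale` and `fault_time`.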

Keywords: launch vehicle; thrust drop fault; computational guidance; reinforcement learning; trajectory reconfiguration

Classification: V448.1 [Aeronautical and Astronautical Science and Technology: Flight Vehicle Design]
