Authors: WANG Haoning; GUO Jie[1]; ZHANG Baochao; WANG Ziyao; TANG Shengjing[1]; LI Xiang[1]
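Affiliations: [1] School of Aerospace Engineering, Beijing Institute of Technology, Beijing 100081, China; [2] Beijing Institute of Astronautical Systems Engineering, Beijing 100076, China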
Source: Journal of Astronautics (《宇航学报》), 2024, No. 9, pp. 1429-1444 (16 pages)
Abstract: To meet the need of hypersonic glide vehicles to evade multiple unknown threats during entry, an autonomous avoidance entry guidance method is proposed for multiple no-fly zones encountered online. The continuous avoidance of multiple no-fly zones encountered in flight is formulated as a sequential decision-making problem, and a reinforcement learning solution is designed to improve the vehicle's autonomous avoidance capability. A Markov decision process for the no-fly zone avoidance problem is established, taking into account both the generalization capability and the training efficiency of the reinforcement learning agent. On this basis, a multi-agent coordinated decision-making method based on a fuzzy control strategy is designed: a heading decision-making agent is assigned to each no-fly zone encountered online and makes an independent heading decision, while the threat level of each no-fly zone is evaluated from the real-time environment and the heading command is generated by coordinating these decisions. Theoretical analysis and numerical simulations show that the method enables the vehicle to avoid multiple no-fly zones encountered online while satisfying terminal and process constraints, and that it has good robustness and generalization capability.
Keywords: entry guidance; multiple no-fly zone avoidance; reinforcement learning; fuzzy control; online autonomous decision-making
Classification number: V448.235 [Aeronautical and Astronautical Science and Technology: Flight Vehicle Design]