Deep reinforcement learning based integrated evasion and impact hierarchical intelligent policy of exo-atmospheric vehicles  

Authors: Leliang REN, Weilin GUO, Yong XIAN, Zhenyu LIU, Daqiao ZHANG, Shaopeng LI

Affiliations: [1] Xi’an Research Institute of High Technology, Xi’an 710025, China; [2] China Xi’an Satellite Control Center, Xi’an 710043, China; [3] Department of Automation, Tsinghua University, Beijing 100084, China

Source: Chinese Journal of Aeronautics (中国航空学报, English edition), 2025, Issue 1, pp. 409-426 (18 pages)

Funding: Co-supported by the National Natural Science Foundation of China (No. 62103432); the China Postdoctoral Science Foundation (No. 284881); and the Young Talent Fund of University Association for Science and Technology in Shaanxi, China (No. 20210108).

Abstract: Exo-atmospheric vehicles have limited maneuverability, which creates a conflict between evasive maneuvering and precision strike. To address the Integrated Evasion and Impact (IEI) decision problem under multiple constraints, a hierarchical intelligent decision-making method based on Deep Reinforcement Learning (DRL) is proposed. First, an intelligent decision-making framework combining “DRL evasion decision” and “impact prediction guidance decision” is established. It takes the impact point deviation correction ability as the constraint and the maximum miss distance as the objective, and it effectively addresses the poor decision-making performance caused by the large IEI decision space. Second, to solve the sparse reward problem in evasion decision-making, a hierarchical decision-making method consisting of a maneuver timing decision and a maneuver duration decision is proposed, and the corresponding Markov Decision Process (MDP) is designed. Simulation experiments are designed to analyze the advantages and computational complexity of the proposed method. The results show that the proposed model performs well and has low computational resource requirements: the minimum miss distance is 21.3 m while impact point accuracy is guaranteed, and a single decision takes 4.086 ms on an STM32F407 single-chip microcomputer, which indicates engineering application value.

Keywords: Exo-atmospheric vehicle; Integrated evasion and impact; Deep reinforcement learning; Hierarchical intelligent policy; Single-chip microcomputer; Miss distance

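Classification codes: TJ7 [Weapons Science and Technology: Weapon Systems and Utilization Engineering]; V448 [Aerospace Science and Technology: Flight Vehicle Design]; TP18 [Automation and Computer Technology: Control Theory and Control Engineering]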

 
