Recorded recurrent deep reinforcement learning guidance laws for intercepting endoatmospheric maneuvering missiles  


Authors: Xiaoqi Qiu, Peng Lai, Changsheng Gao, Wuxing Jing

Affiliations: [1] Department of Aerospace Engineering, Harbin Institute of Technology, Harbin, 150001, China; [2] Shanghai Electro-Mechanical Engineering Institute, Shanghai Academy of Spaceflight Technology, Shanghai, 201100, China

Source: Defence Technology (防务技术), 2024, Issue 1, pp. 457-470 (14 pages)

Funding: Supported by the National Natural Science Foundation of China (Grant No. 12072090).

Abstract: This work proposes a recorded recurrent twin delayed deep deterministic (RRTD3) policy gradient algorithm to solve the challenge of constructing guidance laws for intercepting endoatmospheric maneuvering missiles with uncertainties and observation noise. The attack-defense engagement scenario is modeled as a partially observable Markov decision process (POMDP). Given the benefits of recurrent neural networks (RNNs) in processing sequence information, an RNN layer is incorporated into the agent's policy network to alleviate the bottleneck that traditional deep reinforcement learning methods face when dealing with POMDPs. Because the detection frequency of an interceptor is usually higher than its guidance frequency, the measurements from the interceptor's seeker during each guidance cycle are combined into one sequence as the input to the policy network. During training, the hidden states of the RNN layer in the policy network are recorded to overcome the partially observable problem that this RNN layer causes inside the agent. The training curves show that the proposed RRTD3 successfully enhances data efficiency, training speed, and training stability. The test results confirm the advantages of the RRTD3-based guidance laws over some conventional guidance laws.
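The two mechanisms the abstract describes (feeding all seeker measurements of one guidance cycle to a recurrent policy as a sequence, and recording the RNN hidden state alongside each stored transition) can be illustrated with a minimal sketch. This is not the authors' RRTD3 implementation: the plain Elman cell, the class `RecurrentPolicy`, the function `collect_episode`, the buffer layout, all dimensions, and the random placeholder measurements are assumptions made only for illustration, and the TD3 critics and update rule are omitted.

```python
import numpy as np


class RecurrentPolicy:
    """Tiny Elman-RNN policy (hypothetical): measurement sequence -> bounded action."""

    def __init__(self, obs_dim, hidden_dim, act_dim, seed=0):
        rng = np.random.default_rng(seed)
        self.W_x = rng.normal(0.0, 0.1, (hidden_dim, obs_dim))
        self.W_h = rng.normal(0.0, 0.1, (hidden_dim, hidden_dim))
        self.b_h = np.zeros(hidden_dim)
        self.W_a = rng.normal(0.0, 0.1, (act_dim, hidden_dim))
        self.hidden_dim = hidden_dim

    def initial_state(self):
        return np.zeros(self.hidden_dim)

    def act(self, obs_seq, h):
        """Consume every measurement of one guidance cycle; return (action, new hidden state)."""
        for obs in obs_seq:
            h = np.tanh(self.W_x @ obs + self.W_h @ h + self.b_h)
        action = np.tanh(self.W_a @ h)  # squashed to [-1, 1]
        return action, h


def collect_episode(policy, n_cycles, meas_per_cycle, obs_dim, seed=1):
    """Roll out the policy and record the hidden state seen at the start of each cycle."""
    rng = np.random.default_rng(seed)
    buffer = []
    h = policy.initial_state()
    for _ in range(n_cycles):
        # Placeholder for noisy seeker measurements (assumption: random data, no dynamics).
        obs_seq = rng.normal(size=(meas_per_cycle, obs_dim))
        h_before = h.copy()
        action, h = policy.act(obs_seq, h)
        # Store the recorded hidden state with the transition.
        buffer.append({"h0": h_before, "obs_seq": obs_seq, "action": action})
    return buffer


policy = RecurrentPolicy(obs_dim=4, hidden_dim=8, act_dim=2)
buffer = collect_episode(policy, n_cycles=5, meas_per_cycle=10, obs_dim=4)

# Replaying a stored sample from its recorded hidden state reproduces the action
# taken during the rollout, without re-running the earlier cycles.
sample = buffer[3]
replayed_action, _ = policy.act(sample["obs_seq"], sample["h0"])
```

In this sketch, storing `h0` lets a sampled transition be evaluated with the same recurrent context the policy had when it acted, which is one plausible reading of "recording" the hidden states; the paper itself should be consulted for the actual scheme.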

Keywords: Endoatmospheric interception; Missile guidance; Reinforcement learning; Markov decision process; Recurrent neural networks

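Classification codes: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]; TJ765.3 [Automation and Computer Technology: Control Science and Engineering]; TJ761.7 [Weapon Science and Technology: Weapon Systems and Application Engineering]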

 
