Research on game strategy of spacecraft chase and escape based on adaptive augmented random search


Authors: JIAO Jie (焦杰); GOU Yongjie (苟永杰); WU Wenbo (吴文博); PAN Binfeng (泮斌峰) [1,2]

Affiliations: [1] School of Astronautics, Northwestern Polytechnical University, Xi'an 710072, Shaanxi, China; [2] National Key Laboratory of Aerospace Flight Dynamics, Xi'an 710072, Shaanxi, China; [3] Shanghai Aerospace Systems Engineering Institute, Shanghai 201108, China

Source: Journal of Northwestern Polytechnical University (《西北工业大学学报》), 2024, No. 1, pp. 117-128 (12 pages)

Abstract: This paper addresses the survival-type differential game interception problem that arises in pursuit-evasion games between a spacecraft and a non-cooperative target. The pursuit-evasion strategy is studied using reinforcement learning, and an adaptive-augmented random search (A-ARS) algorithm is proposed. First, to address the sparse-reward problem in sequential decision making, an exploration method based on perturbations in the policy parameter space is designed, which accelerates policy convergence. Second, to counter premature convergence to a local optimum, a novelty function is designed to guide the policy update, which improves data utilization efficiency. Finally, numerical simulations and comparisons with augmented random search (ARS), proximal policy optimization (PPO), and deep deterministic policy gradient (DDPG) verify the effectiveness and advanced performance of the method.
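The abstract names two ingredients, exploration by perturbing the policy parameters and a novelty function that guides the update, but gives neither the update rule nor the novelty definition. The following is a minimal sketch of how such a combination could look, assuming a basic ARS-style antithetic update and a k-nearest-neighbour novelty bonus. It is not the A-ARS algorithm from the paper: `novelty`, `ars_novelty_step`, `rollout`, the behavior descriptor, and all hyperparameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def novelty(behavior, archive, k=3):
    """Mean Euclidean distance from `behavior` to its k nearest archived behaviors.

    Assumed novelty measure for illustration; returns 0.0 for an empty archive.
    """
    if not archive:
        return 0.0
    dists = sorted(math.dist(behavior, past) for past in archive)
    nearest = dists[:k]
    return sum(nearest) / len(nearest)


def ars_novelty_step(theta, rollout, archive, n_dirs=8, top_b=4,
                     step_size=0.05, noise_std=0.1, novelty_weight=0.5,
                     rng=random):
    """One ARS-style update of `theta` (list of floats) with a novelty bonus.

    `rollout(params)` is a hypothetical callable returning `(reward, behavior)`,
    where `behavior` is a sequence of floats describing the episode outcome.
    `archive` (a list of past behaviors) is mutated in place.
    """
    candidates = []
    for _ in range(n_dirs):
        # Explore by perturbing the policy parameters, not the actions.
        delta = [rng.gauss(0.0, 1.0) for _ in theta]
        scored = []
        for sign in (1.0, -1.0):
            params = [t + sign * noise_std * d for t, d in zip(theta, delta)]
            reward, behavior = rollout(params)
            score = reward + novelty_weight * novelty(behavior, archive)
            scored.append((score, behavior))
        candidates.append((scored[0], scored[1], delta))

    # Keep the directions whose better antithetic rollout scored highest.
    candidates.sort(key=lambda c: max(c[0][0], c[1][0]), reverse=True)
    best = candidates[:top_b]

    # Normalize the step by the standard deviation of the scores used.
    used = [s for plus, minus, _ in best for s in (plus[0], minus[0])]
    mean = sum(used) / len(used)
    sigma = math.sqrt(sum((s - mean) ** 2 for s in used) / len(used))
    if sigma < 1e-8:
        sigma = 1.0  # all scores equal: avoid dividing by zero

    new_theta = list(theta)
    for plus, minus, delta in best:
        gain = step_size / (len(best) * sigma) * (plus[0] - minus[0])
        for i, d in enumerate(delta):
            new_theta[i] += gain * d

    # Archive the behavior of the single best rollout so that later
    # iterations are pushed away from outcomes already seen.
    top_plus, top_minus, _ = best[0]
    archive.append(max(top_plus, top_minus, key=lambda sb: sb[0])[1])
    return new_theta
```

In a pursuit-evasion setting, `rollout` would simulate one engagement and might return a capture-or-escape reward together with, say, the terminal relative state as the behavior descriptor; that mapping is an assumption of this sketch. When the sparse reward is zero for every perturbation, the novelty term is the only signal that moves `theta`, which is the situation the bonus is meant to handle.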

Keywords: non-cooperative target; pursuit-evasion game; differential game; reinforcement learning; sparse reward

Classification code: V448.2 [Aeronautical and Astronautical Science and Technology: Flight Vehicle Design]

 
