Multi-Vehicle Cooperative Trajectory Planning Based on Deep Deterministic Policy Gradient Learning

Novel collaborative planning based on DDPG learning for multi-vehicle


Authors: LI Yahui [1]; HUANG Jianyou; WANG Chenlin; ZHOU Guofeng [1]; WEI Changzhu [2]

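Affiliations: [1] China Academy of Launch Vehicle Technology, Beijing 100076, China; [2] Harbin Institute of Technology, Harbin 150001, Heilongjiang, China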

Source: Flight Dynamics (《飞行力学》), 2024, No. 6, pp. 50-54, 62 (6 pages in total)

Abstract: To meet the needs of multi-vehicle cooperative planning, an online trajectory planning method based on a deep reinforcement learning framework is proposed. The basic elements of reinforcement learning are constructed from the objectives and constraints of trajectory planning, so that the trajectory planning agent explores different, randomly initialized environments and gradually improves its planning policy through the deep deterministic policy gradient (DDPG) algorithm. Simulation results show that the agent's reward converges after 70,000 rounds of training. For different task requirements such as cooperative reconnaissance, rendezvous, and rapid arrival, the proposed method plans a feasible trajectory that satisfies the constraints within 4 s, indicating good prospects for engineering application.
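The abstract names DDPG but gives no implementation details. As a rough illustration of the generic algorithm only (not the authors' network, state, action, or reward design), the sketch below shows the three core DDPG operations: the critic's bootstrapped target, the soft update of target parameters, and the deterministic policy gradient step for the actor. The linear critic, the tanh actor, and the values of `gamma`, `tau`, and the step size are all assumptions chosen for brevity.

```python
import numpy as np


def td_target(reward, done, next_q, gamma=0.99):
    """Critic regression target: y = r + gamma * (1 - done) * Q'(s', mu'(s'))."""
    return reward + gamma * (1.0 - done) * next_q


def soft_update(target, online, tau=0.005):
    """Polyak averaging: target <- tau * online + (1 - tau) * target."""
    return tau * online + (1.0 - tau) * target


def actor(W, s):
    """Deterministic policy a = tanh(W s), bounded to [-1, 1] (assumed form)."""
    return np.tanh(W @ s)


def critic(w_s, w_a, s, a):
    """Toy linear critic Q(s, a) = w_s . s + w_a . a (assumed form)."""
    return w_s @ s + w_a @ a


def actor_gradient(W, w_a, s):
    """Deterministic policy gradient dQ/dW = (dQ/da * da/dz) outer s.

    For the linear critic above dQ/da = w_a, and for the tanh actor
    da/dz = 1 - a**2, where z = W s.
    """
    a = actor(W, s)
    return np.outer(w_a * (1.0 - a ** 2), s)


# One gradient-ascent step on the actor for a single hypothetical state.
rng = np.random.default_rng(0)
s = rng.normal(size=4)            # hypothetical 4-dimensional state
W = np.zeros((2, 4))              # actor weights for a 2-dimensional action
w_s = rng.normal(size=4)          # critic weights on the state
w_a = np.array([1.0, -0.5])       # critic weights on the action

q_before = critic(w_s, w_a, s, actor(W, s))
W_new = W + 0.1 * actor_gradient(W, w_a, s)   # 0.1 is an assumed step size
q_after = critic(w_s, w_a, s, actor(W_new, s))
```

In a full DDPG training loop these pieces would be applied to mini-batches drawn from a replay buffer, with neural networks in place of the linear maps and exploration noise added to the actor's output; none of those details are stated in the abstract.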

Keywords: vehicle swarm; trajectory planning algorithm; deep reinforcement learning; cooperative planning

Classification codes: V249 [Aerospace Science and Technology: Flight Vehicle Design]; V448

 
