Authors: LI Yahui (李亚辉) [1]; HUANG Jianyou (黄建友); WANG Chenlin (王辰琳); ZHOU Guofeng (周国峰) [1]; WEI Changzhu (韦常柱) [2]
Affiliations: [1] China Academy of Launch Vehicle Technology, Beijing 100076, China; [2] Harbin Institute of Technology, Harbin 150001, Heilongjiang, China
Source: Flight Dynamics (《飞行力学》), 2024, No. 6, pp. 50-54, 62 (6 pages)
Abstract: An online trajectory planning method based on a deep reinforcement learning framework is proposed for the requirements of multi-vehicle cooperative planning. The basic elements of reinforcement learning are constructed from the objectives and constraints of trajectory planning, so that the trajectory-planning agent explores randomly initialized environments, and the planning policy is gradually improved by the deep deterministic policy gradient (DDPG) algorithm. Simulation results show that the agent's reward value converges after 70,000 training iterations. For different task requirements such as cooperative reconnaissance, rendezvous, and rapid arrival, the proposed method can plan a feasible trajectory satisfying the constraints within 4 s, and therefore has good prospects for engineering application.
Keywords: flight vehicle swarm; trajectory planning algorithm; deep reinforcement learning; cooperative planning
Classification (CLC): V249 [Aeronautics and Astronautics Science and Technology / Aircraft Design]; V448
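The abstract describes constructing the reinforcement-learning elements (state, action, reward) from the planning objectives and constraints and improving the policy with DDPG. Below is a minimal sketch of such a DDPG training update, assuming a generic PyTorch setup; the state/action dimensions, network sizes, hyperparameters, and transition format are illustrative assumptions and do not reproduce the authors' implementation.

# Minimal DDPG sketch for a trajectory-planning agent (illustrative only;
# state, action and reward definitions below are assumptions, not the paper's).
import random
from collections import deque

import torch
import torch.nn as nn
import torch.optim as optim

STATE_DIM = 6    # e.g. own position/velocity plus relative target position (assumption)
ACTION_DIM = 2   # e.g. commanded heading-rate and speed change (assumption)
ACTION_BOUND = 1.0

class Actor(nn.Module):
    """Deterministic policy: maps a state to a bounded continuous action."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, ACTION_DIM), nn.Tanh(),
        )
    def forward(self, s):
        return ACTION_BOUND * self.net(s)

class Critic(nn.Module):
    """Q-network: estimates the value of a state-action pair."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, 1),
        )
    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1))

actor, critic = Actor(), Critic()
actor_target, critic_target = Actor(), Critic()
actor_target.load_state_dict(actor.state_dict())
critic_target.load_state_dict(critic.state_dict())
actor_opt = optim.Adam(actor.parameters(), lr=1e-4)
critic_opt = optim.Adam(critic.parameters(), lr=1e-3)
buffer = deque(maxlen=100_000)          # replay buffer of (s, a, r, s2, done)
GAMMA, TAU, BATCH = 0.99, 0.005, 64     # illustrative hyperparameters

def soft_update(target, source):
    """Polyak-average target network parameters toward the online network."""
    for t, s in zip(target.parameters(), source.parameters()):
        t.data.mul_(1 - TAU).add_(TAU * s.data)

def train_step():
    """One DDPG update from a random minibatch of stored transitions."""
    if len(buffer) < BATCH:
        return
    batch = random.sample(buffer, BATCH)
    s, a, r, s2, done = (torch.as_tensor(x, dtype=torch.float32)
                         for x in map(list, zip(*batch)))
    r, done = r.unsqueeze(1), done.unsqueeze(1)

    # Critic: regress Q(s, a) toward the bootstrapped target value.
    with torch.no_grad():
        target_q = r + GAMMA * (1 - done) * critic_target(s2, actor_target(s2))
    critic_loss = nn.functional.mse_loss(critic(s, a), target_q)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    # Actor: ascend the critic's value of the actions the policy proposes.
    actor_loss = -critic(s, actor(s)).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()

    soft_update(actor_target, actor)
    soft_update(critic_target, critic)

The replay buffer and the slowly tracking target networks are the standard DDPG ingredients for stabilizing learning with a deterministic continuous-action policy; in an application like the one described, the reward would additionally encode the trajectory constraints and task objectives (reconnaissance, rendezvous, arrival time), which this sketch leaves unspecified.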