Deep Reinforcement Learning-based UAV Real-time Trajectory Planning  Cited by: 1


Authors: SHU Jiansheng (舒健生); ZHOU Yuxiang (周于翔); ZHENG Xiaolong (郑晓龙); LAI Xiaochang (赖晓昌); TAO Datian (陶大甜)

Affiliations: [1] Rocket Force University of Engineering, Xi'an 710025, China; [2] School of Information Engineering, Wuhan University of Technology, Wuhan 430070, China

Source: Fire Control & Command Control (《火力与指挥控制》), 2023, No. 12, pp. 133-141 (9 pages)

Abstract: As UAV technology is applied and developed, the environments in which UAVs fly their missions are becoming increasingly complex and changeable, which places higher demands on obstacle-avoidance maneuvering and on the real-time performance of trajectory planning. Using a deep reinforcement learning algorithm, which generalizes well and depends only weakly on the environment, real-time path planning is carried out on the basis of obstacle map information acquired by radar in real time. A continuous reward function is designed around the characteristics of the two-dimensional trajectory planning problem, which resolves the sparse-reward problem that reinforcement learning algorithms face in two-dimensional planar trajectory planning. Following the idea of transfer learning, multiple training environments are designed and training proceeds in stages ordered by task difficulty, which lowers the training difficulty, improves the training results, and makes convergence more stable. In the experiments, the SAC algorithm is compared with the current mainstream PPO and TD3 algorithms. The results show that SAC converges quickly, performs well in real time, and produces smoother trajectories.
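The abstract does not give the form of the continuous reward function, so the following is only a minimal Python sketch of the general idea: replacing a sparse goal-only reward with a dense signal available at every step. Everything in it is an assumption made for illustration and is not taken from the paper: the function name `dense_reward`, the circular obstacle model `(x, y, radius)`, the weights, and the terminal bonus and penalty values.

```python
import math


def dense_reward(pos, prev_pos, goal, obstacles, safe_dist=2.0,
                 goal_radius=0.5, w_progress=1.0, w_obstacle=0.5,
                 step_cost=0.01):
    """Hypothetical dense reward for 2D planning (not the paper's formula).

    pos, prev_pos, goal: (x, y) tuples.
    obstacles: iterable of (x, y, radius) circles, an assumed obstacle model.
    Returns (reward, done).
    """
    d_prev = math.dist(prev_pos, goal)
    d_now = math.dist(pos, goal)

    # Progress term: positive when the step reduced the distance to the goal.
    # The small step cost discourages wandering.
    reward = w_progress * (d_prev - d_now) - step_cost

    for ox, oy, radius in obstacles:
        clearance = math.dist(pos, (ox, oy)) - radius
        if clearance <= 0.0:
            # Collision: fixed terminal penalty (assumed value).
            return -10.0, True
        if clearance < safe_dist:
            # Penalty grows linearly as the UAV enters the safety margin.
            reward -= w_obstacle * (safe_dist - clearance) / safe_dist

    if d_now <= goal_radius:
        # Goal reached: fixed terminal bonus (assumed value).
        return reward + 10.0, True
    return reward, False
```

Because the progress and proximity terms are nonzero on almost every step, an agent such as SAC, PPO, or TD3 receives a learning signal well before it first reaches the goal, which is the property a sparse reward lacks.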

Keywords: UAV; SAC algorithm; two-dimensional planar planning; real-time trajectory planning

Classification code: TP273 [Automation and Computer Technology: Detection Technology and Automation Devices]

 
