Authors: SHU Jiansheng (舒健生); ZHOU Yuxiang (周于翔); ZHENG Xiaolong (郑晓龙); LAI Xiaochang (赖晓昌); TAO Datian (陶大甜)
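Affiliations: [1] Rocket Force University of Engineering, Xi'an 710025, China; [2] School of Information Engineering, Wuhan University of Technology, Wuhan 430070, China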
Source: Fire Control & Command Control (《火力与指挥控制》), 2023, No. 12, pp. 133-141 (9 pages)
Abstract: As UAV technology is applied and developed, the flight environments in which UAVs carry out missions are becoming increasingly complex and changeable, which places higher demands on obstacle-avoidance maneuvering and on the real-time performance of path planning. Using a deep reinforcement learning algorithm that generalizes well and depends only weakly on the environment, real-time path planning is performed on the obstacle map information obtained by radar in real time. A continuous reward function is designed around the characteristics of the two-dimensional path planning problem, which resolves the sparse-reward problem that reinforcement learning algorithms face in two-dimensional planar path planning. Following the idea of transfer learning, multiple training environments are designed and the agent is trained step by step in order of task difficulty, which reduces the training difficulty, improves the training results, and makes convergence more stable. In the experiments, the SAC algorithm is compared with the current mainstream PPO and TD3 algorithms. The results show that SAC converges quickly, has good real-time performance, and produces smoother tracks.
Classification number: TP273 [Automation and Computer Technology: Detection Technology and Automation Devices]
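The abstract mentions a continuous reward function that addresses sparse rewards in two-dimensional path planning, but this record does not give its form. The following is only a minimal sketch of what a dense reward of that general kind might look like, not the authors' function. The function name `continuous_reward`, all weights, the safety distance, the goal radius, and the representation of obstacles as points are hypothetical choices made for illustration.

```python
import math

def continuous_reward(pos, prev_pos, goal, obstacles,
                      safe_dist=1.0, goal_radius=0.5,
                      progress_weight=1.0, obstacle_weight=0.5,
                      step_penalty=0.01, goal_bonus=10.0):
    """Hypothetical dense reward for one step of 2D path planning.

    All parameters are illustrative assumptions, not values from the paper.
    pos, prev_pos, goal: (x, y) tuples; obstacles: list of (x, y) points.
    """
    # Progress term: positive when the step reduces the distance to the goal.
    d_prev = math.dist(prev_pos, goal)
    d_now = math.dist(pos, goal)
    reward = progress_weight * (d_prev - d_now)

    # Small per-step cost so that shorter paths score higher.
    reward -= step_penalty

    # Proximity penalty that grows linearly inside the safety distance.
    if obstacles:
        d_obs = min(math.dist(pos, o) for o in obstacles)
        if d_obs < safe_dist:
            reward -= obstacle_weight * (safe_dist - d_obs) / safe_dist

    # Terminal bonus on reaching the goal region.
    if d_now <= goal_radius:
        reward += goal_bonus
    return reward
```

Because every step returns a signal tied to progress and obstacle clearance, the agent receives feedback long before it reaches the goal, which is the general motivation for replacing a sparse terminal reward with a continuous one.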