UAV Intelligent Mission Planning Based on SAC-Lagrangian Under Confrontation Conditions


Authors: YUE Longfei, YANG Rennong [2], YAN Mengda, ZHAO Xiaoru [2], ZUO Jialiang [2], LIU Huiliang, ZHANG Mingyuan [1]

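Affiliations: [1] National Key Laboratory of Electromagnetic Energy Technology, Naval University of Engineering, Wuhan 430000, China; [2] Air Traffic Control and Navigation College, Air Force Engineering University, Xi'an 710000, China; [3] Xi'an Satellite Control Center, Xi'an 710000, China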

Source: Electronics Optics & Control (《电光与控制》), 2024, No. 8, pp. 1-7 (7 pages)

Funding: National Natural Science Foundation of China (62106284); Natural Science Foundation of Shaanxi Province (2021JQ-370).

Abstract: Owing to their low cost, expendability, distributed deployment, agility and flexibility, UAVs have performed well in many civil fields. However, limited by their level of intelligence, completing missions autonomously and safely under complex adversarial conditions remains a major challenge. To address the intelligence and safety problems in current UAV mission planning, a UAV intelligent planning method based on safe reinforcement learning, SAC-Lagrangian, is proposed. Taking radar threats, no-fly-zone safety constraints and surface-to-air missile (SAM) confrontation into account, the mission planning problem is modeled as a Constrained Markov Decision Process (CMDP) and converted into a dual problem by the Lagrangian multiplier method. The maximum-entropy Soft Actor-Critic (SAC) algorithm is then used to approximate the optimal policy, so that the agent maximizes expected return while complying with the safety constraints. Simulation results show that, compared with other baseline algorithms, the proposed method maintains mission performance while ensuring safety, adapts to dynamically changing scenarios, and achieves a task completion rate of 96%, which indicates that it is efficient, robust and safe.
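The abstract does not give the update rules, so the following is only a minimal sketch of the generic Lagrangian relaxation it describes, not the authors' implementation. It assumes the usual CMDP form (maximize expected return subject to expected cost not exceeding a limit), in which the policy is trained on the penalized signal r - λ·c and the multiplier λ is adjusted by projected gradient ascent on the constraint violation. The names `update_multiplier`, `penalized_reward`, `cost_limit` and `lr` are hypothetical.

```python
def update_multiplier(lam, avg_cost, cost_limit, lr=0.1):
    """One projected gradient-ascent step on the Lagrange multiplier.

    lam grows when the measured average cost exceeds cost_limit and
    shrinks otherwise; max(0, .) keeps it non-negative, as required
    for an inequality constraint.
    """
    return max(0.0, lam + lr * (avg_cost - cost_limit))


def penalized_reward(reward, cost, lam):
    """Signal a policy learner (e.g. SAC) would maximize: r - lam * c."""
    return reward - lam * cost


if __name__ == "__main__":
    lam = 0.0
    # Hypothetical average episode costs against a limit of 2.0.
    for avg_cost in [5.0, 4.0, 3.0, 1.0]:
        lam = update_multiplier(lam, avg_cost, cost_limit=2.0)
        print(f"avg_cost={avg_cost:.1f} -> lambda={lam:.2f}")
```

In a full agent, the multiplier step would typically alternate with the SAC actor and critic updates, so that the penalty weight tracks how far the current policy is from satisfying the safety constraints.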

Keywords: UAV; safe reinforcement learning; SAC-Lagrangian; intelligent mission planning; robustness

Classification: V219 [Aerospace Science and Technology: Aerospace Propulsion Theory and Engineering]

 
