Authors: YUE Longfei; YANG Rennong[2]; YAN Mengda; ZHAO Xiaoru[2]; ZUO Jialiang[2]; LIU Huiliang; ZHANG Mingyuan[1]
Affiliations: [1] National Key Laboratory of Electromagnetic Energy Technology, Naval University of Engineering, Wuhan 430000, China; [2] Air Traffic Control and Navigation College, Air Force Engineering University, Xi'an 710000, China; [3] Xi'an Satellite Control Center, Xi'an 710000, China
Source: Electronics Optics & Control (《电光与控制》), 2024, No. 8, pp. 1-7 (7 pages)
Funding: National Natural Science Foundation of China (62106284); Natural Science Foundation of Shaanxi Province (2021 JQ-370).
Abstract: Thanks to their low cost, expendability, distributed deployment, agility and flexibility, UAVs have been highly successful in many civil fields. Their limited level of intelligence, however, still makes it very challenging for them to complete missions autonomously and safely under complex adversarial conditions. To address the intelligence and safety problems in current UAV mission planning, an intelligent UAV planning method based on safe reinforcement learning, called SAC-Lagrangian, is proposed. Taking into account radar threats, no-fly-zone safety constraints and surface-to-air missile (SAM) countermeasure conditions, the mission planning problem is modeled as a Constrained Markov Decision Process (CMDP), which is transformed into a dual problem by the Lagrangian multiplier method. The maximum-entropy Soft Actor-Critic (SAC) algorithm is then used to approximate the optimal policy, so that the agent maximizes the expected return while satisfying the safety constraints. Simulation results show that, compared with other baseline algorithms, the proposed method ensures safety while maintaining mission performance, adapts to dynamically changing scenarios, and achieves a mission completion rate of 96%. The proposed method is therefore efficient, robust and safe.
Keywords: UAV; safe reinforcement learning; SAC-Lagrangian; intelligent mission planning; robustness
Classification: V219 [Aerospace Science and Technology: Aerospace Propulsion Theory and Engineering]
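Note: The abstract describes turning a constrained problem into a dual problem with a Lagrangian multiplier. The following is only a minimal sketch of that general idea on a toy scalar problem, not the paper's SAC-Lagrangian implementation. The function name `solve_toy_constrained`, the quadratic "return" and the linear "cost" are all hypothetical stand-ins chosen for illustration; in the actual method the primal step would be a SAC policy update rather than a gradient step on a scalar.

```python
def solve_toy_constrained(steps=2000, lr_x=0.05, lr_lam=0.05, limit=1.0):
    """Primal-dual gradient iteration on a toy constrained problem.

    maximize   r(x) = -(x - 2)^2    (hypothetical stand-in for expected return)
    subject to c(x) = x <= limit    (hypothetical stand-in for expected safety cost)

    Lagrangian: L(x, lam) = r(x) - lam * (c(x) - limit), with lam >= 0.
    """
    x, lam = 0.0, 0.0
    for _ in range(steps):
        grad_r = -2.0 * (x - 2.0)   # derivative of the return term
        grad_c = 1.0                # derivative of the cost term
        # Primal step: ascend the Lagrangian in x.
        x += lr_x * (grad_r - lam * grad_c)
        # Dual step: raise lam while the constraint is violated,
        # lower it otherwise, and project back onto lam >= 0.
        lam = max(0.0, lam + lr_lam * (x - limit))
    return x, lam


if __name__ == "__main__":
    x, lam = solve_toy_constrained()
    print(f"x = {x:.4f}, lambda = {lam:.4f}")
```

For this toy problem the KKT conditions give x = 1 and a multiplier of 2, so the iteration should settle near those values. The multiplier acts as an adaptive penalty weight: it grows only while the safety cost exceeds its limit, which is what lets a Lagrangian-style method trade return against constraint satisfaction without hand-tuning a fixed penalty.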