Multi-UAV collaborative pursuit method via hierarchical reinforcement learning (基于分层强化学习的多无人机协同围捕方法)

Authors: SUN Yi-hao (孙懿豪), YAN Chao (闫超), XIANG Xiao-jia (相晓嘉) [1]; TANG Deng-qing (唐邓清), ZHOU Han (周晗), JIANG Jie (姜杰) [2]

Affiliations: [1] College of Intelligence Science and Technology, National University of Defense Technology, Changsha, Hunan 410073, China; [2] China Academy of Launch Vehicle Technology, Beijing 100076, China

Source: Control Theory & Applications (《控制理论与应用》), 2025, No. 1, pp. 96-108 (13 pages)

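Funding: Supported by the National Natural Science Foundation of China (62403240), the Natural Science Foundation of Jiangsu Province (BK20241396), and the Hunan Provincial Postgraduate Research and Innovation Project (CX20240114).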

Abstract: To address the pursuit of a dynamic target in complex, obstacle-laden environments, this paper proposes a multi-UAV collaborative pursuit method based on hierarchical reinforcement learning. The method comprises two levels of learning: low-level sub-policy learning and high-level sub-policy switching. Specifically, the collaborative pursuit task is decomposed into two sub-tasks, navigation with obstacle avoidance and navigation with collision avoidance. The corresponding low-level sub-policies are learned independently, giving each UAV the obstacle-avoidance and collision-avoidance skills it needs for collaborative pursuit. On this basis, a sparse reward function with a switching penalty is designed to train the high-level sub-policy switching module, which removes the dependence on manually defined rules and combines the low-level skills automatically. Numerical simulations and software-in-the-loop experiments show that the proposed method significantly reduces the difficulty of learning the pursuit policy and achieves the highest pursuit success rate among the compared baseline methods.

Keywords: hierarchical reinforcement learning; obstacle avoidance; collision avoidance; multi-UAV; pursuit

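Classification codes: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]; V279 [Automation and Computer Technology: Control Science and Engineering]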
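
The abstract describes a high-level module that switches between two independently learned sub-policies and is trained with a sparse reward that includes a switching penalty. The abstract gives no formula, so the following is only a minimal sketch of what such a reward could look like. The sub-policy indices, the `StepOutcome` fields, the reward magnitudes, and the function names are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

# Hypothetical indices for the two low-level skills named in the abstract.
OBSTACLE_AVOIDANCE = 0
COLLISION_AVOIDANCE = 1


@dataclass
class StepOutcome:
    """Assumed terminal signals observed after one environment step."""
    captured: bool = False  # the target was captured at this step
    crashed: bool = False   # the UAV hit an obstacle or another UAV


def high_level_reward(
    prev_choice: Optional[int],
    choice: int,
    outcome: StepOutcome,
    success_reward: float = 1.0,   # assumed magnitude
    failure_reward: float = -1.0,  # assumed magnitude
    switch_penalty: float = 0.05,  # assumed magnitude
) -> float:
    """Sparse reward with a switching penalty (illustrative only).

    The reward is zero on ordinary steps, non-zero only at terminal
    events, and reduced whenever the selected sub-policy changes.
    """
    reward = 0.0
    if outcome.captured:
        reward += success_reward
    elif outcome.crashed:
        reward += failure_reward
    # Discourage needless switching between sub-policies.
    if prev_choice is not None and choice != prev_choice:
        reward -= switch_penalty
    return reward


def select_action(choice: int, observation, sub_policies: Sequence[Callable]):
    """Delegate control to the low-level sub-policy chosen by the high level."""
    return sub_policies[choice](observation)


if __name__ == "__main__":
    # Toy episode: the high level switches once, then the target is captured.
    choices = [OBSTACLE_AVOIDANCE, OBSTACLE_AVOIDANCE, COLLISION_AVOIDANCE]
    outcomes = [StepOutcome(), StepOutcome(), StepOutcome(captured=True)]
    prev, total = None, 0.0
    for c, o in zip(choices, outcomes):
        total += high_level_reward(prev, c, o)
        prev = c
    print(f"episode return: {total:.2f}")
```

In this sketch the penalty is charged only on steps where the choice differs from the previous one, so a high-level policy that holds one skill for a stretch of steps pays nothing extra. How the paper actually shapes and scales these terms cannot be determined from the abstract.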
