Multi-UAV collaborative pursuit method via hierarchical reinforcement learning (基于分层强化学习的多无人机协同围捕方法)


Authors: SUN Yi-hao (孙懿豪), YAN Chao (闫超), XIANG Xiao-jia (相晓嘉) [1], TANG Deng-qing (唐邓清), ZHOU Han (周晗), JIANG Jie (姜杰) [2]

Affiliations: [1] College of Intelligence Science and Technology, National University of Defense Technology, Changsha, Hunan 410073, China; [2] China Academy of Launch Vehicle Technology, Beijing 100076, China

Source: Control Theory & Applications (《控制理论与应用》), 2025, No. 1, pp. 96-108 (13 pages)

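Funding: Supported by the National Natural Science Foundation of China (62403240), the Natural Science Foundation of Jiangsu Province (BK20241396), and the Hunan Provincial Postgraduate Research Innovation Project (CX20240114).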

Abstract: To address dynamic target pursuit in complex obstacle environments, this paper proposes a multi-UAV collaborative pursuit method based on hierarchical reinforcement learning. The method comprises two levels of learning: low-level sub-policy learning and high-level sub-policy switching. Specifically, the collaborative pursuit task is decomposed into two sub-tasks, navigation with obstacle avoidance and navigation with collision avoidance, and the corresponding low-level sub-policies are learned independently, giving each UAV the obstacle-avoidance and collision-avoidance skills needed for collaborative pursuit. On this basis, a sparse reward function with a switching penalty is designed to train the high-level sub-policy switching module, which removes the dependence on manually defined rules and combines the low-level skills automatically. Numerical simulations and software-in-the-loop experiments show that the proposed method significantly reduces the difficulty of learning the pursuit policy and achieves the highest pursuit success rate among the compared baseline methods.
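The abstract names a sparse reward with a switching penalty but gives neither its values nor its exact form. The sketch below is one possible reading, not the authors' implementation: the high-level module receives a nonzero reward only at a terminal event (capture or failure) and pays a small cost whenever it changes the active sub-policy. The names `RewardConfig`, `high_level_reward`, and `episode_return`, and all numeric values, are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Indices of the two low-level sub-policies named in the abstract.
OBSTACLE_AVOIDANCE = 0   # navigation with obstacle avoidance
COLLISION_AVOIDANCE = 1  # navigation with collision avoidance


@dataclass(frozen=True)
class RewardConfig:
    """Hypothetical reward constants; the abstract does not state them."""
    capture_bonus: float = 1.0
    failure_penalty: float = -1.0
    switch_penalty: float = 0.05


def high_level_reward(prev_option: Optional[int], option: int,
                      captured: bool, failed: bool,
                      cfg: RewardConfig = RewardConfig()) -> float:
    """Sparse terminal reward minus a cost for changing the active sub-policy."""
    reward = 0.0
    if captured:
        reward += cfg.capture_bonus
    elif failed:
        reward += cfg.failure_penalty
    # Penalize a switch only when there is a previous choice to compare with.
    if prev_option is not None and option != prev_option:
        reward -= cfg.switch_penalty
    return reward


def episode_return(options: Sequence[int], captured_at_end: bool,
                   cfg: RewardConfig = RewardConfig()) -> float:
    """Undiscounted return of one episode given the chosen sub-policy per step."""
    total = 0.0
    prev = None
    last = len(options) - 1
    for t, option in enumerate(options):
        terminal = t == last
        total += high_level_reward(prev, option,
                                   captured=terminal and captured_at_end,
                                   failed=terminal and not captured_at_end,
                                   cfg=cfg)
        prev = option
    return total
```

Under these assumed constants, two successful episodes of equal length differ in return only by the number of switches, so the high-level module is pushed to change sub-policy only when doing so helps reach a capture.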

Keywords: hierarchical reinforcement learning; obstacle avoidance; collision avoidance; multi-UAV pursuit

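Classification codes: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]; V279 [Automation and Computer Technology: Control Science and Engineering]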

 
