Stage-induced learning-based cooperative target hunting strategy for multiple unmanned surface vehicles

Original title: 基于阶段诱导学习的多无人艇协同目标围捕策略

Authors: QU Xingru; JIANG Yuze; LONG Feifei; ZHANG Rubo; GAO Ying (College of Mechanical and Electrical Engineering, Dalian Minzu University, Dalian 116600, China)

Affiliation: [1] College of Mechanical and Electrical Engineering, Dalian Minzu University, Dalian 116600, Liaoning, China

Source: Chinese Journal of Ship Research (《中国舰船研究》), 2025, Issue 1, pp. 162-171 (10 pages)

Funding: National Natural Science Foundation of China (61673084); Fundamental Research Funds for the Central Universities (04442024046).

Abstract: [Objectives] To address the problem of an intelligently escaping target unmanned surface vehicle (USV) at sea, this paper proposes a cooperative target hunting strategy for multiple USVs based on stage-induced learning.

[Methods] First, a Markov game model of USV hunting and escaping is established, and the criteria for a successful hunt are defined in terms of distance and angle. To improve multi-USV hunting performance against an intelligently escaping target, the centralized training and decentralized execution framework is combined with long short-term memory (LSTM) networks, and cooperative hunting is trained with the multi-agent soft actor-critic (MASAC) algorithm. In addition, a stage-induced cooperative hunting reward mechanism is designed. It shapes the training process according to the current states of the hunters and the target, guides the USVs to complete the hunting task progressively from easier to harder stages, avoids the "lazy hunting USV" phenomenon, and raises the hunting success rate.

[Results] Simulation results, particularly in a scenario with 3 USVs versus 1 target, show that compared with MASAC using only stage-induced rewards, MASAC using only LSTM, and basic MASAC, the proposed strategy increases the hunting success rate by 3.3%, 6.1%, and 24.4%, respectively, which validates its feasibility and effectiveness.

[Conclusions] The proposed strategy offers a useful technical reference for offensive and defensive confrontation between USVs, enhances the capabilities of multi-USV systems in complex marine scenarios, and helps advance the application of USV technology in related fields.
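The abstract states that hunting success is judged by distance and angle and that the reward is staged from easy to hard, but it gives no formulas. The sketch below is only one plausible reading and is not the authors' implementation: the function names, the 2.0 capture radius, the 150 degree gap limit, and all reward values are hypothetical choices made for illustration.

```python
import math

def max_angular_gap(target, hunters):
    """Largest angle (radians) between adjacent hunters as seen from the target."""
    bearings = sorted(
        math.atan2(h[1] - target[1], h[0] - target[0]) for h in hunters
    )
    gaps = [b - a for a, b in zip(bearings, bearings[1:])]
    # Close the circle: gap from the last bearing back around to the first.
    gaps.append(2 * math.pi - (bearings[-1] - bearings[0]))
    return max(gaps)

def is_captured(target, hunters, capture_radius=2.0, max_gap_deg=150.0):
    """Hypothetical success test: all hunters close enough, no wide escape gap."""
    if any(math.dist(target, h) > capture_radius for h in hunters):
        return False
    return max_angular_gap(target, hunters) <= math.radians(max_gap_deg)

def stage_reward(target, hunters, capture_radius=2.0, max_gap_deg=150.0):
    """Hypothetical three-stage team reward (all values are assumptions)."""
    farthest = max(math.dist(target, h) for h in hunters)
    if farthest > capture_radius:
        # Stage 1 (approach): penalise the farthest hunter's distance, so a
        # hunter that stays passive drags the shared reward down.
        return -farthest
    gap = max_angular_gap(target, hunters)
    if gap > math.radians(max_gap_deg):
        # Stage 2 (encircle): small positive reward that grows as the
        # widest escape gap shrinks.
        return 1.0 - gap / (2 * math.pi)
    # Stage 3 (capture): distance and angle conditions both hold.
    return 10.0

# Three hunters evenly spaced at radius 1.5 around a target at the origin.
ring = [(1.5 * math.cos(a), 1.5 * math.sin(a))
        for a in (0.0, 2 * math.pi / 3, 4 * math.pi / 3)]
print(is_captured((0.0, 0.0), ring))
```

Under these assumed thresholds, the evenly spaced ring satisfies both conditions, while three hunters bunched on one side of the target are close enough but leave a wide gap, so they only reach the second stage.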

Keywords: unmanned surface vehicle; cooperative target hunting; multi-agent soft actor-critic; stage-induced reward

Classification code: U664.82 [Transportation Engineering: Ship and Waterway Engineering]
