Missile-target assignment method of naval ship based on deep reinforcement learning (cited by: 3)

Authors: XIAO You-gang [1]; JIN Sheng-cheng; MAO Xiao; WU Guo-hua; LU Zhi-feng (School of Traffic & Transportation Engineering, Central South University, Changsha, Hunan 410018, China; Shanghai Academy of Spaceflight Technology, Shanghai 201109, China)

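Affiliations: [1] School of Traffic and Transportation Engineering, Central South University, Changsha, Hunan 410018; [2] Shanghai Electro-Mechanical Engineering Institute (上海机电工程研究所), Shanghai 201109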

Source: Control Theory & Applications (《控制理论与应用》), 2024, No. 6, pp. 990-998 (9 pages)

Abstract: To address the missile-target assignment problem in naval ship air and missile defense under adversarial conditions, this paper proposes a deep reinforcement learning algorithm that incorporates an attention mechanism. First, a missile-target assignment model for a ship carrying multiple missile types is constructed, and the problem is formulated as a Markov decision process that accounts for multi-wave target interception. Next, the reinforcement learning policy network is built on an encoder-decoder framework: targets are encoded with a multi-head attention mechanism, and the decoder combines the global embedding of all targets with the embeddings of individual targets to produce reliable missile-target assignments. Finally, simulation experiments are carried out on assignment profit, assignment timeliness, and the training process of the policy network. The results show that the proposed method generates high-profit missile-target assignment schemes, that it improves computation speed on large-scale decision problems by 10%~94% relative to the baseline algorithms, and that its policy network converges quickly and stably.
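The abstract gives no formulas, so the following is only a minimal sketch of how a sequential missile-target assignment can be framed as a Markov decision process. Everything in it is an assumption made for illustration rather than the paper's model: the `State` fields, the reward (the drop in expected surviving threat value, as in the classical weapon-target assignment objective), the `kill_prob` and `value` inputs, and the greedy rule, which stands in for the learned attention-based policy network.

```python
from dataclasses import dataclass


@dataclass
class State:
    """Hypothetical MDP state (not taken from the paper)."""
    stock: list      # stock[i]: remaining missiles of type i
    survival: list   # survival[j]: probability that target j is still alive


def step(state, action, kill_prob, value):
    """Fire one missile of type i at target j; return (next_state, reward).

    The reward is the reduction in expected surviving threat value, so the
    episode return equals sum_j value[j] * (1 - final_survival[j]).
    """
    i, j = action
    if state.stock[i] <= 0:
        raise ValueError("no missile of this type left")
    stock = list(state.stock)
    survival = list(state.survival)
    stock[i] -= 1
    before = survival[j]
    survival[j] = before * (1.0 - kill_prob[i][j])
    reward = value[j] * (before - survival[j])
    return State(stock, survival), reward


def greedy_policy(state, kill_prob, value):
    """Stand-in for a learned policy: choose the feasible action with the
    largest immediate reward. Returns None when no missile is left."""
    best, best_gain = None, -1.0
    for i, left in enumerate(state.stock):
        if left <= 0:
            continue  # mask missile types that are exhausted
        for j, alive in enumerate(state.survival):
            gain = value[j] * alive * kill_prob[i][j]
            if gain > best_gain:
                best, best_gain = (i, j), gain
    return best


def rollout(stock, kill_prob, value):
    """Run one episode until all missiles are used; return (plan, return, state)."""
    state = State(list(stock), [1.0] * len(value))
    plan, total = [], 0.0
    while True:
        action = greedy_policy(state, kill_prob, value)
        if action is None:
            break
        state, reward = step(state, action, kill_prob, value)
        plan.append(action)
        total += reward
    return plan, total, state
```

As a usage example with made-up numbers, `rollout([1, 1], [[0.9, 0.5], [0.6, 0.6]], [10.0, 5.0])` assigns missile type 0 to target 0 and type 1 to target 1, for an episode return of about 12. In a learned approach, a trained network would replace `greedy_policy`, with the same masking of infeasible actions.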

Keywords: air and missile defense; missile-target assignment; weapon-target assignment; deep reinforcement learning

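Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]; TJ762.33 [Automation and Computer Technology: Control Science and Engineering]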
