UAV intelligent avoidance decisions based on deep reinforcement learning algorithm

Authors: WU Fengguo; TAO Wei [2]; LI Hui [1,3]; ZHANG Jianwei [1,3]; ZHENG Chengchen

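Affiliations: [1] National Key Laboratory of Fundamental Science on Synthetic Vision, Sichuan University, Chengdu 610065, Sichuan, China; [2] China Ship Development and Design Center, Wuhan 430064, Hubei, China; [3] School of Computer Science, Sichuan University, Chengdu 610065, Sichuan, China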

Source: Systems Engineering and Electronics (《系统工程与电子技术》), 2023, No. 6, pp. 1702-1711 (10 pages)

Funding: Supported by the 13th Five-Year Plan pre-research project on military-wide common information system equipment (31505550302).

Abstract: To improve the survival rate of unmanned aerial vehicles (UAVs) in complex air combat scenarios, a reinforcement learning method is used to generate maneuver strategies on a publicly available UAV air combat game simulation platform. Building on the double deep Q-network (DDQN) and deep deterministic policy gradient (DDPG) algorithms, a unit state sequence (USS) is proposed, and a gated recurrent unit (GRU) is used to fuse the situation features in the USS, improving state feature recognition and algorithm convergence in complex air combat scenarios. Experimental results show that, against missiles guided by the standard proportional guidance algorithm, the agent achieves a missile-evasion survival rate of 98%, and in complex scenarios where multiple missiles attack simultaneously it still achieves a survival rate of 88%. Compared with traditional simple maneuvering modes, the survival rate of the UAV is significantly improved.
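The abstract does not spell out how the USS is built or how the GRU is wired into DDQN or DDPG, so the following is only a minimal sketch under stated assumptions. It assumes the USS is a variable-length list of per-unit state vectors (for example, one vector per incoming missile) and shows how a GRU with the standard update equations can fold such a list into one fixed-length feature vector that a value or policy network could consume. The names `GRUCell` and `fuse_uss`, the 4-dimensional unit state, and the random weights are all hypothetical and are not taken from the paper.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def init_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Plain-Python GRU cell (hypothetical helper, untrained random weights)."""

    def __init__(self, input_size, hidden_size, seed=0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        # Gates: z = update, r = reset, n = candidate state.
        # W acts on the input, U on the previous hidden state.
        self.W = {g: init_matrix(hidden_size, input_size, rng) for g in "zrn"}
        self.U = {g: init_matrix(hidden_size, hidden_size, rng) for g in "zrn"}
        self.b = {g: [0.0] * hidden_size for g in "zrn"}

    def _pre(self, gate, x, h):
        wx = matvec(self.W[gate], x)
        uh = matvec(self.U[gate], h)
        return [a + c + d for a, c, d in zip(wx, uh, self.b[gate])]

    def step(self, x, h):
        z = [sigmoid(v) for v in self._pre("z", x, h)]
        r = [sigmoid(v) for v in self._pre("r", x, h)]
        # The reset gate scales the previous state before the recurrent weights.
        reset_h = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(v) for v in self._pre("n", x, reset_h)]
        # Blend the previous state and the candidate with the update gate.
        return [zi * hi + (1.0 - zi) * ni for zi, hi, ni in zip(z, h, n)]


def fuse_uss(cell, uss):
    """Fold a variable-length unit state sequence into one fixed-length vector."""
    h = [0.0] * cell.hidden_size
    for unit_state in uss:
        h = cell.step(unit_state, h)
    return h


# Assumed 4-dimensional unit states; the values are made up for illustration.
cell = GRUCell(input_size=4, hidden_size=8)
one_missile = [[0.2, -0.1, 0.9, 0.3]]
three_missiles = [
    [0.2, -0.1, 0.9, 0.3],
    [0.5, 0.4, 0.7, -0.2],
    [-0.3, 0.1, 0.8, 0.6],
]
feature_one = fuse_uss(cell, one_missile)
feature_three = fuse_uss(cell, three_missiles)
print(len(feature_one), len(feature_three))
```

Both calls return a vector of the same length regardless of how many units are in the sequence, which is the property that lets a fixed-size network handle a varying number of threats. In a real agent the GRU weights would be trained jointly with the DDQN or DDPG networks rather than left at random values.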

Keywords: deep reinforcement learning; unmanned aerial vehicle (UAV); unit state sequence (USS); gated recurrent unit (GRU)

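Classification: E926 [Military: Military Equipment Science]; TP181 [Ordnance Science and Technology: Weapon Systems and Application Engineering]; V211 [Automation and Computer Technology: Control Theory and Control Engineering]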
