Pursuit-Evasion Game Algorithm Based on Deep Reinforcement Learning (基于深度强化学习的追逃博弈算法)

Cited by: 12

Authors: Tan Lang, Gong Qinghai [1,2], Wang Huixia

Affiliations: [1] Beijing Aerospace Automatic Control Institute, Beijing 100854, China; [2] National Key Laboratory of Science and Technology on Aerospace Intelligence Control, Beijing 100854, China

Source: Aerospace Control (《航天控制》), 2018, No. 6, pp. 3-8, 19 (7 pages)

Funding: National Natural Science Foundation of China (61773341)

Abstract: Missile attack-defense confrontation will be an important mode of combat in future local wars. This paper simulates the missile attack-defense process with a pursuit-evasion game between intelligent mini-cars and proposes a pursuit-evasion game algorithm that takes the Deep Deterministic Policy Gradient (DDPG) algorithm as its prototype. The state consists of the line-of-sight (LOS) distance and the LOS angle, and the reward function is designed by borrowing the idea of PID control. The algorithm was verified both in mathematical simulation and on physical intelligent mini-cars. The results show that it can effectively control the mini-car to complete the task of capturing the evader and that it adapts well.
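The abstract names the ingredients (LOS distance and LOS angle as state, a PID-inspired reward) but does not give the formulas. The Python sketch below is only one plausible reading, not the authors' method: the combined error, the gains `kp`, `ki`, `kd`, the angle weight `w_angle`, the capture radius, and the terminal bonus are all assumptions introduced for illustration.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple


def los_state(pursuer_xy: Tuple[float, float],
              pursuer_heading: float,
              evader_xy: Tuple[float, float]) -> Tuple[float, float]:
    """Return (LOS distance, LOS angle relative to the pursuer's heading)."""
    dx = evader_xy[0] - pursuer_xy[0]
    dy = evader_xy[1] - pursuer_xy[1]
    distance = math.hypot(dx, dy)
    bearing = math.atan2(dy, dx)
    # Wrap the relative angle into [-pi, pi).
    angle = (bearing - pursuer_heading + math.pi) % (2 * math.pi) - math.pi
    return distance, angle


@dataclass
class PIDStyleReward:
    """Hypothetical PID-inspired reward; all parameter values are assumptions."""
    kp: float = 1.0              # weight on the current error (P term)
    ki: float = 0.01             # weight on the accumulated error (I term)
    kd: float = 0.5              # weight on the error rate (D term)
    w_angle: float = 0.5         # assumed weight of the angle error
    capture_radius: float = 0.2  # assumed capture distance
    capture_bonus: float = 10.0  # assumed bonus on capture
    integral: float = 0.0
    prev_error: Optional[float] = None

    def reset(self) -> None:
        """Clear the accumulated terms at the start of an episode."""
        self.integral = 0.0
        self.prev_error = None

    def __call__(self, los_distance: float, los_angle: float,
                 dt: float = 0.1) -> float:
        # Combined tracking error: distance plus weighted absolute angle.
        error = los_distance + self.w_angle * abs(los_angle)
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        # Negative PID-style cost: smaller and shrinking errors score higher.
        reward = -(self.kp * error + self.ki * self.integral
                   + self.kd * derivative)
        if los_distance <= self.capture_radius:
            reward += self.capture_bonus
        return reward


distance, angle = los_state((0.0, 0.0), 0.0, (3.0, 4.0))
reward_fn = PIDStyleReward()
first = reward_fn(5.0, 0.0)   # evader 5 units away
second = reward_fn(4.0, 0.0)  # pursuer has closed in by 1 unit
```

In a DDPG training loop, a reward of this kind would be evaluated once per environment step and `reset()` called at the start of each episode. The derivative term rewards closing speed, which is why `second` is larger than `first` here.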

Keywords: missile attack-defense confrontation; pursuit-evasion game; deep reinforcement learning; DDPG

Classification: TP242.6 [Automation and Computer Technology: Detection Technology and Automatic Devices]
