Pursuit-Evasion Game Algorithm Based on Deep Reinforcement Learning

Cited by: 12


Authors: Tan Lang; Gong Qinghai [1,2]; Wang Huixia

Affiliations: [1] Beijing Aerospace Automatic Control Institute, Beijing 100854, China; [2] National Key Laboratory of Science and Technology on Aerospace Intelligence Control, Beijing 100854, China

Source: Aerospace Control (《航天控制》), 2018, No. 6, pp. 3-8, 19 (7 pages)

Funding: National Natural Science Foundation of China (61773341)

Abstract: In future local wars, missile attack-defense confrontation will be an important mode of combat. This paper simulates the missile attack-defense process with a pursuit-evasion game between intelligent mini-cars. Taking the Deep Deterministic Policy Gradient (DDPG) algorithm as the prototype, using the line-of-sight (LOS) distance and LOS angle as the state, and designing the reward function by borrowing ideas from PID control, it proposes a pursuit-evasion game algorithm. The algorithm was verified both in mathematical simulation and on physical mini-cars. The experimental results show that it can effectively control the mini-car to complete the task of capturing the evader and that it has good adaptability.

Keywords: missile attack-defense confrontation; pursuit-evasion game; deep reinforcement learning; DDPG

Classification: TP242.6 [Automation and Computer Technology: Detection Technology and Automation Devices]
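
The abstract names the state (LOS distance and LOS angle) and a reward inspired by PID control, but this record does not give the formulas. The following is only a minimal sketch of one way such a state and reward could be computed; the function `los_state`, the class `PIDStyleReward`, the gains `kp`, `ki`, `kd`, and the `angle_weight` are all hypothetical and are not taken from the paper.

```python
import math
from dataclasses import dataclass
from typing import Optional, Tuple


def los_state(pursuer_xy: Tuple[float, float],
              pursuer_heading: float,
              evader_xy: Tuple[float, float]) -> Tuple[float, float]:
    """Return (LOS distance, LOS angle relative to the pursuer heading)."""
    dx = evader_xy[0] - pursuer_xy[0]
    dy = evader_xy[1] - pursuer_xy[1]
    distance = math.hypot(dx, dy)
    bearing = math.atan2(dy, dx)
    # Wrap the relative angle into [-pi, pi).
    angle = (bearing - pursuer_heading + math.pi) % (2 * math.pi) - math.pi
    return distance, angle


@dataclass
class PIDStyleReward:
    """Hypothetical reward: negative weighted sum of P, I and D error terms."""
    kp: float = 1.0            # assumed proportional gain
    ki: float = 0.01           # assumed integral gain
    kd: float = 0.5            # assumed derivative gain
    angle_weight: float = 0.5  # assumed weight of the angle error
    _integral: float = 0.0
    _prev_error: Optional[float] = None

    def __call__(self, distance: float, angle: float) -> float:
        # Combined tracking error: far away or badly aligned means large error.
        error = distance + self.angle_weight * abs(angle)
        self._integral += error
        derivative = 0.0 if self._prev_error is None else error - self._prev_error
        self._prev_error = error
        # Smaller (and shrinking) error gives a higher reward.
        return -(self.kp * error + self.ki * self._integral + self.kd * derivative)


if __name__ == "__main__":
    reward = PIDStyleReward()
    d, a = los_state((0.0, 0.0), 0.0, (3.0, 4.0))
    print(d, a, reward(d, a))
```

In a DDPG training loop, the pair returned by `los_state` would be fed to the actor and critic as the observation, and the scalar from `PIDStyleReward` would be stored with each transition in the replay buffer.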

 
