Pursuit-Evasion Game Decision Technology of High Speed Vehicles  Cited by: 7


Authors: CUI Ya-Meng; WANG Hui-Xia [1,2]; ZHENG Chun-Sheng; HU Rui-Guang

Affiliations: [1] Beijing Aerospace Automatic Control Institute, Beijing 100854, China [2] National Key Laboratory of Science and Technology on Aerospace Intelligence Control, Beijing 100854, China

Source: Journal of Command and Control, 2021, No. 4, pp. 403-414 (12 pages)

Funding: Supported by the National Defense Basic Scientific Research Fund (JCKY2019203C029).

Abstract: A single red-side vehicle facing multiple blue-side interceptors is limited by its individual capability and its available penetration measures, so it struggles to penetrate successfully. To address this problem, an intelligent game strategy based on reinforcement learning is designed that lets the red vehicle evade multiple blue interceptors. First, a mathematical model of the red vehicle is established, covering behaviors such as decoy release, maneuver adjustment, and attitude adjustment. Second, a mathematical model of the blue vehicles is established, in which the blue side intercepts the red vehicle using proportional navigation guidance. Next, a vehicle escape algorithm based on deep deterministic policy gradient (DDPG) is designed. To speed up agent learning and algorithm convergence, training combines pre-training on prior knowledge with a prioritized experience replay mechanism. Simulation results show that the algorithm enables the red vehicle to escape successfully when facing multiple blue interceptors.
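The abstract states that the blue interceptors use proportional navigation but gives no formula. As a rough illustration only, the textbook planar form commands a lateral acceleration a = N · Vc · dλ/dt, where dλ/dt is the line-of-sight rate and Vc is the closing speed. The sketch below assumes a 2D engagement, a navigation gain N = 3, and a hypothetical function name `pn_lateral_accel`; none of these details come from the paper, whose interceptor model may differ.

```python
import math


def pn_lateral_accel(pursuer_pos, pursuer_vel, target_pos, target_vel, nav_gain=3.0):
    """Planar proportional navigation: a = N * Vc * (line-of-sight rate).

    Positions and velocities are (x, y) tuples. nav_gain = 3.0 is an assumed
    value for illustration, not a parameter taken from the paper.
    """
    # Relative position and velocity of the target as seen from the pursuer.
    rx = target_pos[0] - pursuer_pos[0]
    ry = target_pos[1] - pursuer_pos[1]
    vx = target_vel[0] - pursuer_vel[0]
    vy = target_vel[1] - pursuer_vel[1]

    range_sq = rx * rx + ry * ry
    if range_sq == 0.0:
        return 0.0  # coincident positions: line of sight is undefined

    # Line-of-sight angular rate (rad/s), positive counterclockwise.
    los_rate = (rx * vy - ry * vx) / range_sq
    # Closing speed is positive while the range is shrinking.
    closing_speed = -(rx * vx + ry * vy) / math.sqrt(range_sq)

    return nav_gain * closing_speed * los_rate


# Interceptor flying along +x at 300, target 1000 ahead and drifting in +y at 50.
accel = pn_lateral_accel((0.0, 0.0), (300.0, 0.0), (1000.0, 0.0), (0.0, 50.0))
print(accel)  # approximately 45.0, in acceleration units matching the inputs
```

With zero line-of-sight rate (a pure head-on or tail-chase geometry) the command is zero, which is the defining property of this guidance law: the pursuer only turns when the line of sight rotates.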

Keywords: vehicle attack-defense confrontation; behavior model; target assignment; deep reinforcement learning; prior knowledge

Classification: O225 [Science: Operations Research and Cybernetics]; E91 [Science: Mathematics]

 
