Authors: SONG Qi (宋祺); ZUO Jialiang (左家亮)[1]; ZHANG Ying (张滢); YAN Mengda (闫孟达); WU Ao (吴傲); LI Leyan (李乐言). Air Traffic Control and Navigation School, Air Force Engineering University, Xi'an 710051, China
Source: Journal of Ordnance Equipment Engineering (《兵器装备工程学报》), 2025, No. 2, pp. 145-156 (12 pages)
Abstract: Existing research on intelligent decision-making for beyond-visual-range (BVR) air combat focuses mostly on maneuver decisions, with comparatively little work on tactical decisions. To address the problems that maneuver decisions are hard to interpret and tactical decisions are hard to generate, this paper proposes an algorithm that combines deep learning (DL) with Monte Carlo tree search (MCTS). An autonomous learning and decision-making framework for air combat agents integrates offline tactical learning with online tactical decision-making, yielding a DL-MCTS-based tactical decision-making method for BVR air combat. In the offline learning stage, historical engagement data and tactical theory are used to build a prior tactical planning database comprising perception, policy, and evaluation data sets, and deep neural networks trained on this database give the agent three functional modules: a perception module, a planner, and an evaluator. In the real-time confrontation stage, tactical perception and decision-making run as two parallel processing lines, and an adversarial game tree is built. MCTS fuses the agent's three networks, performing selection, expansion, simulation, and backpropagation at each game node to search in real time for the best strategy in the current situation. In a head-on attack experiment, the agent shows basic decision-making capability after offline training; after 50 search iterations it eliminates the adversary's first-shot missile advantage and gradually achieves its own missile launch conditions. The experimental results indicate that the method's decisions are highly interpretable and that its decision speed is satisfactory.
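The four MCTS steps named in the abstract (selection, expansion, simulation or evaluation, and backpropagation) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy game (players alternately take 1 to 3 stones, and whoever takes the last stone wins), the uniform `planner`, the uninformed `evaluator`, and the exploration constant `c` are all assumptions standing in for the paper's air combat model and trained networks, and the perception module is omitted.

```python
import math


class Node:
    def __init__(self, stones, prior):
        self.stones = stones      # toy state: stones left for the player to move
        self.prior = prior        # prior probability from the (stand-in) planner
        self.children = {}        # move -> Node
        self.visits = 0
        self.value_sum = 0.0      # value for the player who moved INTO this node

    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0


def planner(stones):
    """Hypothetical stand-in for a policy network: uniform prior over legal moves."""
    moves = [m for m in (1, 2, 3) if m <= stones]
    return {m: 1.0 / len(moves) for m in moves}


def evaluator(stones):
    """Hypothetical stand-in for a value network, from the mover's point of view.

    With no stones left the player to move has lost (-1.0); every other
    state is scored 0.0, i.e. the evaluator is deliberately uninformed.
    """
    return -1.0 if stones == 0 else 0.0


def search(stones, iterations=50, c=1.4):
    """Run MCTS from `stones` and return the visit count of each root move."""
    root = Node(stones, 1.0)
    for _ in range(iterations):
        node, path = root, [root]
        # Selection: follow the child with the best PUCT score down to a leaf.
        while node.children:
            parent = node
            node = max(
                parent.children.values(),
                key=lambda ch: ch.q()
                + c * ch.prior * math.sqrt(parent.visits) / (1 + ch.visits),
            )
            path.append(node)
        # Expansion: add one child per legal move unless the leaf is terminal.
        if node.stones > 0:
            for move, prior in planner(node.stones).items():
                node.children[move] = Node(node.stones - move, prior)
        # Evaluation: the evaluator replaces a random rollout ("simulation").
        value = evaluator(node.stones)
        # Backpropagation: flip the sign at every level, because each node
        # stores the value for the player who moved into it.
        for visited in reversed(path):
            value = -value
            visited.visits += 1
            visited.value_sum += value
    return {move: child.visits for move, child in root.children.items()}


if __name__ == "__main__":
    counts = search(5, iterations=50)
    print("visit counts:", counts)
    print("chosen move:", max(counts, key=counts.get))
```

The most-visited root move is taken as the decision, which is a common convention rather than something the abstract specifies. In a setting like the one described, trained planner and evaluator networks would replace the two stand-in functions.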
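Keywords: beyond-visual-range air combat; tactical decision-making; intelligent decision-making; deep learning; Monte Carlo tree search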
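Classification: V323 [Aerospace Science and Technology: Man-Machine and Environmental Engineering]; E926 [Military Science: Military Equipment Science]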