Authors: Zhen YANG (杨振) [1]; Lin LI (李琳) [1,2]; Shiyuan CHAI (柴仕元); Jichuan HUANG (黄吉传); Haiyin PIAO (朴海音) [4]; Deyun ZHOU (周德云)
Affiliations: [1] School of Electronics and Information, Northwestern Polytechnical University, Xi'an 710072, China; [2] AVIC Shenyang Aircraft Design and Research Institute, Shenyang 110035, China; [3] The 93147th Unit of Chinese PLA, Chengdu 610091, China; [4] School of Artificial Intelligence, Jilin University, Changchun 130012, China
Source: Acta Aeronautica et Astronautica Sinica (《航空学报》), 2024, No. 20, pp. 146-166 (21 pages)
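Funding: National Natural Science Foundation of China (62006193, 62103338); Key Research and Development Program of Shaanxi Province (2024GX-YBXM-115); Aeronautical Science Foundation of China (2022Z023053001); Fundamental Research Funds for the Central Universities (D5000230150).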
Abstract: Air combat is usually a continuous process involving multiple rounds of missile attack and defense. When evading an incoming air-to-air missile, an Unmanned Combat Aerial Vehicle (UCAV) should therefore weigh the effect of its maneuvers on the entire air combat mission rather than focus on safety alone. This paper proposes an autonomous evasive maneuver method for UCAV air combat under multiple tactical requirements, including miss distance, energy consumption, and terminal situational advantage. A three-dimensional UCAV-missile pursuit-evasion model is established, together with the state space, action space, and reward function for UCAV autonomous evasion. For this model, an LSTM-Dueling DDQN (Long Short-Term Memory-Dueling Double Deep Q-Network) algorithm is proposed, which combines the Double DQN and Dueling DQN network models and uses an LSTM network to extract temporal features. Following the idea of exploration-oriented curriculum learning, dense and sparse reward functions are fused over time so that human experience and policy exploration jointly guide the learning process. In addition, the Chebyshev method is introduced to solve for the Pareto set of policies under different weightings of the tactical requirements, reflecting the conflict and coupling among them. Simulation experiments and analysis show that the proposed method achieves good convergence speed and learning performance, that it is feasible and effective for autonomous evasive maneuvering in air combat under multiple tactical requirements, and that the resulting evasive maneuver policies reflect different evasive tactical requirements while ensuring the UCAV's own safety.
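The abstract names three building blocks without giving their equations: the dueling value/advantage aggregation, the Double DQN target, and Chebyshev scalarization of multiple objectives. The sketch below shows the standard textbook form of each in plain Python, which may differ from the paper's exact formulation. The LSTM feature extractor is omitted, and all function names, the three example objectives, and the numeric values are hypothetical illustrations, not taken from the paper.

```python
from typing import List, Sequence


def dueling_q_values(state_value: float, advantages: Sequence[float]) -> List[float]:
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a')."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def double_dqn_target(reward: float, gamma: float, done: bool,
                      next_q_online: Sequence[float],
                      next_q_target: Sequence[float]) -> float:
    """Double DQN target: the online network selects the next action,
    the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda i: next_q_online[i])
    return reward + gamma * next_q_target[best_action]


def chebyshev_scalarize(objectives: Sequence[float], weights: Sequence[float],
                        ideal: Sequence[float]) -> float:
    """Weighted Chebyshev scalarization: max_i w_i * |f_i - z_i| (smaller is better).
    Sweeping the weights traces out different points of a Pareto front."""
    return max(w * abs(f - z) for f, w, z in zip(objectives, weights, ideal))


# Hypothetical numbers for three discrete maneuver actions.
q_online = dueling_q_values(1.0, [0.5, -0.5, 0.0])   # online network at s'
q_target = dueling_q_values(0.8, [0.1, 0.3, -0.4])   # target network at s'
td_target = double_dqn_target(1.0, 0.9, False, q_online, q_target)

# Hypothetical normalized costs: miss-distance shortfall, energy use,
# terminal disadvantage; the ideal point is zero cost on each.
cost = chebyshev_scalarize([0.2, 0.6, 0.4], [0.5, 0.3, 0.2], [0.0, 0.0, 0.0])
```

Here the online network prefers action 0, so the target uses the target network's value for action 0 rather than the target network's own maximum, which is the decoupling that Double DQN relies on to reduce overestimation.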
Keywords: air combat maneuver; autonomous evasion; tactical requirements; UCAV; LSTM-Dueling DDQN
Classification code: V249 [Aerospace Science and Technology: Flight Vehicle Design]