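Authors: ZHOU Bin (周彬) [1]; SHANG Xi (尚熙); LIU Feng (刘枫) [1]; SU Zhonghua (苏中华) (Southwest China Research Institute of Electronic Equipment, Chengdu 610036, China)
Affiliation: [1] The 29th Research Institute of China Electronics Technology Group Corporation (中国电子科技集团公司第二十九研究所), Chengdu 610036
Source: Electronic Information Warfare Technology (《电子信息对抗技术》), 2025, No. 2, pp. 15-23 (9 pages)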
Abstract: An aircraft carrying a jamming module in a complex electromagnetic environment must combine trajectory avoidance with jamming to maximize its penetration capability. To address this problem, a reinforcement-learning-based method for aircraft trajectory protection and jamming strategy generation is proposed. The electromagnetic countermeasure scenario consists of multiple S-band and C-band radars. The signal-to-noise ratio of the echo signal is calculated after processing by the radar's anti-jamming module, and a Swerling II model is nested with a probability criterion model to study trajectory protection and jamming strategy allocation under a given false-alarm level. Three Markov-chain-based algorithms are selected: Sarsa, deep Q-network (DQN), and Dueling-DQN. They optimize an objective function composed of a trajectory evaluation term and a jamming-effect evaluation term. Comparing the trajectory planning results, the probability that a radar enters tracking mode, and the aircraft's action allocation confirms that an aircraft using reinforcement learning can, while learning from its interaction with the environment, keep the radars from entering guidance mode through its own trajectory planning and jamming strategy generation. Compared with allocating jamming resources along a fixed trajectory, this effectively improves the aircraft's self-protection capability. Finally, a comparison of the probability that a radar enters tracking mode shows that DQN outperforms the other two algorithms.
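Note: The abstract names Sarsa alongside DQN-family algorithms but does not describe their update rules. As a minimal sketch of the standard textbook difference, the tabular code below contrasts the on-policy Sarsa target with the greedy (max) target that DQN approximates with a neural network. The function names, the 2x2 Q-table, and the values of `alpha` and `gamma` are illustrative assumptions and are not taken from the paper, whose state space, reward, and network design are not given here.

```python
import numpy as np

def sarsa_target(q, reward, next_state, next_action, gamma=0.9):
    """On-policy target: bootstraps from the action actually taken next."""
    return reward + gamma * q[next_state, next_action]

def greedy_target(q, reward, next_state, gamma=0.9):
    """Off-policy target (Q-learning / DQN style): bootstraps from the best next action."""
    return reward + gamma * np.max(q[next_state])

def td_update(q, state, action, target, alpha=0.5):
    """Move Q(state, action) a fraction alpha toward the target, in place."""
    q[state, action] += alpha * (target - q[state, action])
    return q

# Hypothetical Q-table: 2 states x 2 actions (e.g. action 0 = maneuver, 1 = jam).
q = np.zeros((2, 2))
q[1] = [1.0, 3.0]

# Same transition, two different targets.
t_sarsa = sarsa_target(q, reward=1.0, next_state=1, next_action=0)
t_greedy = greedy_target(q, reward=1.0, next_state=1)
```

When the exploratory next action is not the greedy one, the two targets differ, which is why the two families can learn different policies on the same environment. Dueling-DQN keeps the greedy target and only changes how the network represents Q (a state value plus mean-centered action advantages).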
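Keywords: reinforcement learning; multifunction radar; aircraft trajectory protection; jamming strategy; Markov chain
Classification: TN974 [Electronics and Telecommunications: Signal and Information Processing]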