Authors: HUANG Xiangsong [1,2]; CHA Ligen; PAN Dapeng
Affiliations: [1] College of Information and Communication Engineering, Harbin Engineering University, Harbin 150001, China; [2] Key Laboratory of Advanced Marine Communication and Information Technology, Ministry of Industry and Information Technology, Harbin Engineering University, Harbin 150001, China
Source: Applied Science and Technology, 2024, No. 4, pp. 145-153 (9 pages)
Funding: Aeronautical Science Foundation of China (201801P6003); Fundamental Research Funds for the Central Universities (3072022CF0802).
Abstract: To address the problem that the traditional deep Q network (DQN) is prone to experience forgetting in radar cognitive jamming decision-making, and therefore repeats erroneous decisions, this paper proposes a cognitive jamming decision method based on a threat warning mechanism-double DQN (TW-DDQN), which combines a threat warning network with an experience replay mechanism. To verify the effectiveness of the algorithm, a simulation environment for cognitive electronic warfare was built that accounts for the correlation between the operating states of a multifunctional radar (MFR) and the available jamming styles; the adversarial game between the radar and the jammer was analyzed; and the influence of different threat radius and threat step parameters on the TW-DDQN training process was examined. Simulation results show that the jammer, through autonomous learning, successfully sustains a prolonged game against the radar and achieves an 80% probability of successful penetration, with training performance clearly better than that of the traditional DQN and the prioritized experience replay DDQN (PER-DDQN).
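The sketch below is a minimal, illustrative outline of the double-DQN update with a uniform experience replay buffer that the abstract builds on; it is not the authors' TW-DDQN implementation. The threat warning network and the threat radius/step parameters are not reproduced, and all names (QNet, double_dqn_update, sample_batch) are hypothetical.

```python
# Illustrative sketch only: a generic double-DQN update with uniform replay.
# The paper's threat warning network and threat radius/step parameters are
# NOT modeled here; replacing random.sample with a priority- or threat-based
# sampler is where PER-DDQN / TW-DDQN would differ.
import random
from collections import deque

import torch
import torch.nn as nn


class QNet(nn.Module):
    """Small MLP mapping a radar-state encoding to Q-values over jamming styles."""

    def __init__(self, state_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, s):
        return self.net(s)


def double_dqn_update(online, target, optimizer, batch, gamma=0.99):
    """One double-DQN step: the online net selects the action, the target net evaluates it."""
    s, a, r, s_next, done = batch
    q_sa = online(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        best_a = online(s_next).argmax(dim=1, keepdim=True)   # action selection
        q_next = target(s_next).gather(1, best_a).squeeze(1)  # action evaluation
        y = r + gamma * (1.0 - done) * q_next                 # TD target
    loss = nn.functional.smooth_l1_loss(q_sa, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


# Uniform experience replay buffer of (s, a, r, s_next, done) transitions.
buffer = deque(maxlen=50_000)


def sample_batch(batch_size=64):
    s, a, r, s2, d = zip(*random.sample(buffer, batch_size))
    return (torch.stack(s), torch.tensor(a),
            torch.tensor(r, dtype=torch.float32),
            torch.stack(s2), torch.tensor(d, dtype=torch.float32))
```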
Keywords: jamming decision-making; cognitive electronic warfare; deep Q network; reinforcement learning; jammer; multifunctional radar; experience replay; constant false alarm rate (CFAR) detection
Classification: TN974 [Electronics and Telecommunications: Signal and Information Processing]