Cognitive jamming decision-making method for multifunctional radar based on DDQN with a threat warning mechanism


Authors: HUANG Xiangsong [1,2]; CHA Ligen; PAN Dapeng

Affiliations: [1] College of Information and Communication Engineering, Harbin Engineering University, Harbin 150001, Heilongjiang, China; [2] Key Laboratory of Advanced Marine Communication and Information Technology, Ministry of Industry and Information Technology, Harbin Engineering University, Harbin 150001, Heilongjiang, China

Source: Applied Science and Technology (《应用科技》), 2024, No. 4, pp. 145-153 (9 pages)

Funding: Aeronautical Science Foundation of China (201801P6003); Fundamental Research Funds for the Central Universities (3072022CF0802).

Abstract: In radar cognitive jamming decision making, the traditional deep Q network (DQN) is prone to experience forgetting and therefore tends to repeat erroneous decisions. To address this problem, this paper proposes a cognitive jamming decision method based on a threat warning mechanism-double DQN (TW-DDQN). The mechanism comprises two components: a threat network and experience replay. To verify the effectiveness of the algorithm, a simulation environment based on cognitive electronic warfare was built, taking into account the correlation between the operating states of a multifunctional radar (MFR) and the jamming styles, and the adversarial game between the radar and the jammer was analyzed. During TW-DDQN training, the influence of different threat radius and threat step parameters on the training process was also discussed. Simulation results show that the jammer, through autonomous learning, successfully sustained a prolonged game with the radar and achieved an 80% probability of successful penetration. The training performance is significantly better than that of the traditional DQN and of prioritized experience replay-DDQN (PER-DDQN), which considers only prioritized experience replay.
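For readers unfamiliar with the double DQN that TW-DDQN builds on, the following is a minimal sketch of how the standard double DQN learning target differs from the plain DQN target. It does not reproduce the paper's threat network or its replay scheme, which the abstract does not specify; the function names, argument names, and discount factor are hypothetical and chosen only for illustration.

```python
def dqn_target(reward, next_q_target, done, gamma=0.99):
    """Plain DQN target: the target network both selects and evaluates the action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target, done, gamma=0.99):
    """Double DQN target: the online network selects, the target network evaluates.

    next_q_online / next_q_target are lists of Q-values for the next state,
    one entry per action (hypothetical inputs for this sketch).
    """
    if done:
        return reward
    # Action selection uses the online network's estimates.
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # Action evaluation uses the target network's value for that action.
    return reward + gamma * next_q_target[best_action]
```

Decoupling selection from evaluation in this way is the usual motivation for double DQN, since it reduces the overestimation that arises when a single network both picks and scores the maximizing action.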

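Keywords: jamming decision; cognitive electronic warfare; deep Q network; reinforcement learning; jammer; multifunctional radar; experience replay; constant false alarm rate detection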

Classification number: TN974 [Electronics and Telecommunications: Signal and Information Processing]

 
