Intelligent Anti-jamming Decision Algorithm for Frequency Hopping Network Based on Multi-agent Fuzzy Deep Reinforcement Learning

Cited by: 7

Authors: ZHAO Zhijin[1,2], ZHU Jiasheng, YE Xueyi[2], SHANG Junna[2]

Affiliations: [1] State Key Laboratory of Information Control Technology in Communication System of No.36 Research Institute, China Electronic Technology Corporation, Jiaxing 314001, China; [2] School of Communication Engineering, Hangzhou Dianzi University, Hangzhou 310018, China

Source: Journal of Electronics & Information Technology, 2022, No. 8, pp. 2814-2823 (10 pages)

Funding: National Natural Science Foundation of China (U19B2016).

Abstract: To improve the anti-jamming performance of asynchronous frequency hopping networks in complex electromagnetic environments, this paper proposes a Multi-agent Fuzzy Deep Reinforcement Learning algorithm based on the Centralized Training and Decentralized Execution framework (MFDRL-CTDE). A state-action space and a reward function are designed for a complex electromagnetic environment in which multiple types of interference coexist and for the asynchronous network structure. The Centralized Training and Decentralized Execution (CTDE) framework is introduced to cope with the mutual influence among agents and with the dynamic environment. A fusion weight allocation strategy based on a fuzzy inference system is proposed to solve the problem of assigning weights to the individual agents during network fusion. The Dueling DQN algorithm and prioritized experience replay are used to improve the efficiency of the algorithm. Simulation results show that the algorithm has a considerable advantage in both convergence speed and best performance, and that it adapts well to changing, complex electromagnetic environments.
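The abstract names two standard building blocks, Dueling DQN and prioritized experience replay, without giving formulas. The sketch below illustrates only the textbook forms of these two ideas: the mean-subtracted dueling aggregation Q(s,a) = V(s) + A(s,a) - mean(A), and proportional prioritization P(i) = p_i^alpha / sum_k p_k^alpha. It is not taken from the paper. The function names, the exponent `alpha = 0.6`, the offset `eps`, and all numbers are assumptions chosen for illustration; the paper's actual network structure and hyperparameters are not stated in this record.

```python
from typing import List


def dueling_q_values(state_value: float, advantages: List[float]) -> List[float]:
    """Combine V(s) and A(s, a) into Q(s, a) using the mean-subtracted dueling form."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


def replay_probabilities(td_errors: List[float],
                         alpha: float = 0.6,   # assumed exponent, not from the paper
                         eps: float = 1e-6) -> List[float]:
    """Proportional prioritization: P(i) = p_i^alpha / sum_k p_k^alpha, p_i = |delta_i| + eps."""
    priorities = [(abs(d) + eps) ** alpha for d in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


# Hypothetical numbers: one state value and advantages for three candidate actions
# (for example, three candidate hopping choices).
q = dueling_q_values(1.0, [0.5, -0.5, 0.0])

# Hypothetical TD errors for three stored transitions; larger errors are sampled more often.
probs = replay_probabilities([2.0, 0.1, 0.0])
```

In this toy case the advantages already average to zero, so each Q-value is simply the state value shifted by its advantage. The small `eps` keeps a transition with zero TD error from never being sampled.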

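Keywords: asynchronous networking; multi-agent; deep reinforcement learning; centralized training and decentralized execution; fuzzy inference system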

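Classification: TN914 [Electronics and Telecommunications: Communication and Information Systems]; TN973 [Electronics and Telecommunications: Information and Communication Engineering]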
