Authors: ZHAO Haoqin (赵浩钦), DUAN Guodong (段国栋), SI Jiangbo (司江勃)[1], HUANG Rui (黄睿), SHI Jia (石嘉) (Institute of Communications Engineering, Xidian University, Xi'an 710071, China; Southwest China Research Institute of Electronic Equipment, Chengdu 610036, China)
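Affiliations: [1] School of Communications Engineering, Xidian University, Xi'an 710071 [2] The 29th Research Institute of China Electronics Technology Group Corporation, Chengdu 610036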
Source: Journal of Electronics & Information Technology (《电子与信息学报》), 2024, No. 7, pp. 2694-2702 (9 pages)
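Funding: Fund of the Key Laboratory of Electromagnetic Space Operations and Applications (电磁空间作战与应用重点实验室基金, JJ2021-001); National Natural Science Foundation of China (62425103).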
Abstract: In electromagnetic countermeasures the environment changes dynamically, and spectrum utilization is low when multiple nodes make frequency-use decisions autonomously. To address this, the paper studies intelligent cooperative spectrum allocation under incomplete electromagnetic information, improving spectrum utilization through intelligent multi-node cooperation. First, spectrum allocation in a complex electromagnetic environment is modelled as an optimization problem that maximizes the number of frequency-using devices. Second, a resource decision-making algorithm based on a multi-node cooperative diversion experience replay mechanism (Cooperation-Deep double Q-network, Co-DDQN) is proposed. The algorithm evaluates historical experience data with a cooperative diversion function and trains on a hierarchical experience pool, so that each agent, observing only its own state information, acquires a lightweight cooperative decision-making ability. This resolves the mismatch between the optimization direction of individual node decisions and the overall optimization goal under low-visibility conditions, and improves spectrum utilization. In addition, a hybrid reward function based on confidence allocation is designed so that each node's decision also takes its individual reward into account, which reduces the emergence of lazy nodes, helps explore a better overall action strategy, and further improves system efficiency. Simulation results show that, with 20 nodes, the proposed algorithm admits more accessible devices than the global greedy algorithm and the genetic algorithm, and comes within 5% of a centralized spectrum allocation algorithm with complete information sharing, making it better suited to cooperative spectrum allocation among low-visibility nodes.
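The abstract names a deep double Q-network as the base learner but gives no update equations. As a rough illustration only, the sketch below computes the standard Double DQN target for a single transition: the online network picks the greedy next action and the target network evaluates it. It does not reproduce the paper's cooperative diversion function, hierarchical experience pool, or hybrid reward. The function name `double_dqn_target`, the discount `gamma`, and the reading of actions as channel choices are assumptions made for this example.

```python
from typing import Sequence


def double_dqn_target(reward: float,
                      next_q_online: Sequence[float],
                      next_q_target: Sequence[float],
                      gamma: float = 0.9,
                      done: bool = False) -> float:
    """Standard Double DQN target for one transition (hypothetical helper).

    next_q_online: Q-values of the online network for the next state.
    next_q_target: Q-values of the target network for the next state.
    Each index is one action, assumed here to be a channel choice.
    """
    if done:
        # Terminal transition: no bootstrapped future value.
        return reward
    # Action selection uses the online network ...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ... while action evaluation uses the target network.
    return reward + gamma * next_q_target[best_action]


# Example: three candidate channels; the online network prefers channel 1.
target = double_dqn_target(1.0, [0.2, 0.8, 0.5], [1.0, 0.5, 2.0], gamma=0.9)
```

Separating selection from evaluation is what distinguishes Double DQN from plain DQN, which would take the maximum of the target network's values directly (2.0 in this example) and tends to overestimate.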
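Keywords: spectrum resource allocation; deep reinforcement learning; incomplete electromagnetic information; cooperative diversion mechanism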
Classification number: TN97 [Electronics and Telecommunications: Signal and Information Processing]