Authors: YUAN Ze; ZHAO Zhijin[1,2]
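Affiliations: [1] School of Communication Engineering, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China; [2] National Key Laboratory of Communication System Information Control Technology, 36th Research Institute of China Electronics Technology Group, Jiaxing, Zhejiang 314001, China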
Source: Journal of Hangzhou Dianzi University: Natural Sciences, 2024, No. 1, pp. 6-13 (8 pages)
Funding: National Natural Science Foundation of China (U19B2016).
Abstract: To improve the anti-jamming performance of frequency hopping communication systems in complex electromagnetic environments, an intelligent anti-jamming decision-making algorithm is proposed that combines Thompson sampling, the Dyna model, and expected SARSA learning. The Dyna model is introduced into expected SARSA learning so that model learning is combined with reinforcement learning, which improves the algorithm's convergence speed and steady-state performance. The action selection mechanism is improved with Thompson sampling and the Tanh function, which improves the algorithm's exploration and exploitation of the environment. The interference environment corresponding to each time slot is taken as the state, and the frequency hopping rate, instantaneous signal bandwidth, frequency sequence, and source power are taken as actions to construct the state-action space; the corresponding frequency hopping system model and reward function are then designed. Simulation results in a complex interference environment where Gaussian white noise, narrowband interference, broadband interference, and frequency sweep interference coexist show that the algorithm balances exploration and exploitation of the environment and achieves faster convergence and stronger anti-jamming capability than the compared algorithms.
Keywords: complex electromagnetic environment; frequency hopping system; expected SARSA learning; Thompson sampling; Dyna model
Classification: TN914.41 [Electronics and Telecommunications: Communication and Information Systems]
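The abstract names two building blocks, expected SARSA learning and Dyna-style planning, without giving their equations. The following is a minimal sketch of how those two pieces are commonly combined in tabular form; it is not the authors' implementation. Everything concrete here is an assumption made for illustration: the toy environment `toy_env_step`, the state and action counts, the reward, and all hyperparameters. Action selection uses plain epsilon-greedy as a stand-in, because the record does not specify how Thompson sampling and the Tanh function are applied.

```python
import random


def expected_value(q_row, epsilon):
    """Expected Q-value of a state under an epsilon-greedy policy."""
    n = len(q_row)
    # Every action gets probability epsilon / n; the greedy action
    # receives the remaining 1 - epsilon on top of that.
    return (epsilon / n) * sum(q_row) + (1.0 - epsilon) * max(q_row)


def expected_sarsa_update(q, s, a, r, s_next, alpha, gamma, epsilon):
    """One expected SARSA step: bootstrap on the policy-weighted average."""
    target = r + gamma * expected_value(q[s_next], epsilon)
    q[s][a] += alpha * (target - q[s][a])


def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy stand-in for the paper's action selection (assumption)."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return q_row.index(max(q_row))


def toy_env_step(s, a, n_states):
    """Hypothetical environment: the interference state cycles through
    time slots, and the reward is 1 when the action index matches it."""
    reward = 1.0 if a == s else 0.0
    return reward, (s + 1) % n_states


def train(n_states=4, n_actions=4, steps=5000, planning_steps=5,
          alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    model = {}  # learned model: (state, action) -> (reward, next state)
    s = 0
    for _ in range(steps):
        # Direct reinforcement learning from one real transition.
        a = choose_action(q[s], epsilon, rng)
        r, s_next = toy_env_step(s, a, n_states)
        expected_sarsa_update(q, s, a, r, s_next, alpha, gamma, epsilon)
        # Model learning: remember what the environment did.
        model[(s, a)] = (r, s_next)
        # Dyna planning: replay remembered transitions as simulated experience.
        seen = list(model)
        for _ in range(planning_steps):
            ps, pa = rng.choice(seen)
            pr, pn = model[(ps, pa)]
            expected_sarsa_update(q, ps, pa, pr, pn, alpha, gamma, epsilon)
        s = s_next
    return q


if __name__ == "__main__":
    q_table = train()
    for state, row in enumerate(q_table):
        print(state, row.index(max(row)))
```

The planning loop is what distinguishes this from plain expected SARSA: each real transition is followed by several updates drawn from the learned model, which is the usual reason Dyna-style methods are reported to converge in fewer real interactions.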