Authors: Li Ye (李烨) [1], Si Ke (司轲) (School of Optical-Electrical & Computer Engineering, University of Shanghai for Science & Technology, Shanghai 200093, China)
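Affiliation: [1] School of Optical-Electrical & Computer Engineering, University of Shanghai for Science & Technology, Shanghai 200093, China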
Source: Application Research of Computers (《计算机应用研究》), 2023, No. 12, pp. 3772-3777 (6 pages)
Funding: Cooperative project funded by Huawei Technologies Co., Ltd. (YBN2019115054).
Abstract: In recent years, deep reinforcement learning has been used as a model-free resource allocation method to address co-channel interference in wireless networks. However, networks trained with a conventional experience replay strategy struggle to learn from valuable experiences, which slows convergence. In addition, a manually specified exploration step size does not take into account how well the algorithm is learning in each training cycle, so exploration of the environment is partly blind and the gain in system spectral efficiency is limited. To address this, the paper proposes a distributed reinforcement learning power control method for frequency division multiple access systems. The method adopts a prioritized experience replay strategy that encourages agents to learn from the more important data collected from the environment, which accelerates learning. It also introduces an exploration strategy with a dynamically adjusted step size that is suited to distributed reinforcement learning, allowing each agent to explore its local environment according to its own learning progress and reducing the blindness caused by manually set step sizes. Experimental results show that, compared with existing algorithms, the proposed method converges faster, improves co-channel interference suppression in mobile scenarios, and achieves higher performance in large networks.
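The abstract names prioritized experience replay but gives no implementation details. As a rough illustration of the general idea only, the sketch below shows a generic proportional prioritized replay buffer. It is not the authors' method: the class name `PrioritizedReplayBuffer` and the parameters `alpha`, `beta`, and `eps` are assumptions chosen for this example.

```python
import random


class PrioritizedReplayBuffer:
    """Generic proportional prioritized replay (illustrative, not the paper's code)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-3):
        self.capacity = capacity
        self.alpha = alpha  # assumed: how strongly priority skews sampling
        self.eps = eps      # assumed: keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0        # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions receive the current maximum priority so they are
        # likely to be replayed before their TD error is known.
        p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        # Sampling probability is proportional to priority ** alpha.
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        n = len(self.data)
        idxs = random.choices(range(n), weights=probs, k=batch_size)
        # Importance-sampling weights offset the bias of non-uniform
        # sampling; they are normalized so the largest weight is 1.
        is_w = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(is_w)
        is_w = [w / max_w for w in is_w]
        return idxs, [self.data[i] for i in idxs], is_w

    def update(self, idxs, td_errors):
        # After a learning step, priority tracks the absolute TD error.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a training loop, an agent would call `add` after each environment step, `sample` to draw a batch, and `update` with the new TD errors once the batch has been used. Transitions with large errors are then replayed more often than under uniform sampling.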