Affiliation: [1] School of Electronics and Information Engineering, Beihang University, Beijing 100191, China
Source: Science China (Information Sciences), 2025, Issue 2, pp. 226-243 (18 pages). Chinese title: 中国科学(信息科学)(英文版) [Science China: Information Sciences, English edition]
Funding: Supported in part by the National Key R&D Program of China (Grant No. 2022YFB2902002) and the National Natural Science Foundation of China (Grant No. 62271024).
Abstract: Deep reinforcement learning for resource allocation has been investigated extensively owing to its ability to handle model-free and end-to-end problems. However, its slow convergence and high time complexity during online training hinder its practical use in dynamic wireless systems. To reduce the training complexity, we resort to graph reinforcement learning to leverage two kinds of relational priors inherent in many wireless communication problems: topology information and permutation properties. To harness the two priors, we first conceive a method to convert the state matrix into a state graph, and then propose a graph deep deterministic policy gradient (DDPG) algorithm with the desired permutation property. To demonstrate how to apply the proposed methods, we consider a representative problem for reinforcement learning, predictive power allocation, which minimizes energy consumption while ensuring the quality of service of each user requesting video streaming. We derive the time complexity required in each time step to train the proposed graph DDPG algorithm and a DDPG algorithm based on fully connected neural networks. Simulations show that the graph DDPG algorithm converges much faster and requires much lower time and space complexity than existing DDPG algorithms to achieve the same learning performance.
Keywords: reinforcement learning; graph neural network; relational priors; resource allocation
Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]