Authors: 罗丽平 (LUO Liping)[1], 潘伟民 (PAN Weimin)
Affiliation: [1] College of Electronic Information, Guangxi Minzu University, Nanning 530006, Guangxi, China
Source: 《物联网学报》 (Chinese Journal on Internet of Things), 2024, No. 2, pp. 46-55 (10 pages)
Funding: Guangxi Science and Technology Major Project (No. AA23073006); Guangxi Minzu University Graduate Innovation Program (No. gxun-chxs2022298).
Abstract: For an intelligent reflecting surface (IRS)-assisted multiple-input single-output (MISO) simultaneous wireless information and power transfer (SWIPT) system, the beamforming vector at the base station and the reflective beamforming vector of the IRS are jointly optimized to maximize the information transmission rate (spectral efficiency), subject to the maximum transmit power of the base station, the unit-modulus constraint on the IRS reflection phase-shift matrix, and the minimum energy constraint of the energy receiver. To solve this non-convex optimization problem, a deep deterministic policy gradient (DDPG) algorithm based on deep reinforcement learning is proposed. Simulation results show that the average reward of the DDPG algorithm depends on the learning rate. With a suitably chosen learning rate, the DDPG algorithm achieves an average mutual information close to that of traditional optimization algorithms, while its running time is significantly lower than that of traditional non-convex optimization algorithms. Even when the number of antennas and the number of reflecting elements are increased, the DDPG algorithm still converges within a short time. This indicates that the DDPG algorithm effectively improves computational efficiency and is better suited to communication services with high real-time requirements.
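The abstract names three constraints (maximum transmit power, unit-modulus IRS phase shifts, minimum energy at the energy receiver) but does not say how a DDPG agent keeps its actions feasible. The following is a minimal sketch of one common approach, not the authors' implementation. It assumes a single information receiver and a single energy receiver, and all function names, the channel symbols (`G`, `h_r`, `h_d`, `g_r`, `g_d`), the action layout, and the penalty weight are hypothetical choices made for illustration.

```python
import numpy as np

def action_to_beams(action, n_antennas, n_elements, p_max):
    """Map a raw actor output (entries assumed in [-1, 1]) to a feasible pair
    (w, theta). The action layout is an assumption of this sketch."""
    action = np.asarray(action, dtype=float)
    # First 2*M entries: real and imaginary parts of the transmit beamformer.
    w = action[:n_antennas] + 1j * action[n_antennas:2 * n_antennas]
    # Rescale only if the power budget ||w||^2 <= p_max would be exceeded.
    power = np.vdot(w, w).real
    if power > p_max:
        w = w * np.sqrt(p_max / power)
    # Next N entries: phases. exp(j*phi) has unit modulus by construction.
    phases = np.pi * action[2 * n_antennas:2 * n_antennas + n_elements]
    theta = np.exp(1j * phases)
    return w, theta

def rate_and_energy(w, theta, G, h_r, h_d, g_r, g_d, noise_power):
    """Achievable rate at the information receiver and received power at the
    energy receiver. G is (N, M); h_r, g_r are (N,); h_d, g_d are (M,)."""
    # Effective channel = IRS-reflected path + direct path.
    h_eff = (h_r.conj() * theta) @ G + h_d.conj()
    g_eff = (g_r.conj() * theta) @ G + g_d.conj()
    snr = np.abs(h_eff @ w) ** 2 / noise_power
    rate = np.log2(1.0 + snr)
    # Energy-conversion efficiency is omitted (assumed equal to 1).
    energy = np.abs(g_eff @ w) ** 2
    return rate, energy

def reward(rate, energy, e_min, penalty=10.0):
    """Rate minus a linear penalty when the minimum-energy constraint is
    violated. The penalty form and weight are assumptions."""
    return rate - penalty * max(0.0, e_min - energy)
```

Under these assumptions the power and unit-modulus constraints hold for every action by construction, so only the energy constraint has to be enforced through the reward signal.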
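Keywords: multiple-input single-output; simultaneous wireless information and power transfer; intelligent reflecting surface; beamforming; deep deterministic policy gradient
Classification: TN929.5 [Electronics and Telecommunications: Communication and Information Systems]; TP18 [Electronics and Telecommunications: Information and Communication Engineering]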