DDPG-based performance optimization algorithm for IRS-assisted simultaneous wireless information and power transfer systems

Cited by: 1


Authors: LUO Liping[1]; PAN Weimin (College of Electronic Information, Guangxi Minzu University, Nanning 530006, China)

Affiliation: [1] College of Electronic Information, Guangxi Minzu University, Nanning 530006, Guangxi, China

Source: Chinese Journal on Internet of Things, 2024, No. 2, pp. 46-55 (10 pages)

Funding: Guangxi Science and Technology Major Project (No. AA23073006); Guangxi Minzu University Graduate Innovation Program (No. gxun-chxs2022298).

Abstract: For an intelligent reflecting surface (IRS)-assisted multiple-input single-output (MISO) simultaneous wireless information and power transfer (SWIPT) system, the beamforming vector at the base station and the reflective beamforming vector of the IRS are jointly optimized to maximize the information transmission rate (spectral efficiency), subject to the maximum transmit power of the base station, the unit-modulus constraint on the IRS reflection phase-shift matrix, and the minimum energy constraint of the energy receiver. To solve the resulting non-convex optimization problem, a deep deterministic policy gradient (DDPG) algorithm based on deep reinforcement learning is proposed. Simulation results show that the average reward of the DDPG algorithm depends on the learning rate. With an appropriately chosen learning rate, the DDPG algorithm achieves an average mutual information close to that of the traditional optimization algorithm, while its running time is significantly shorter than that of the traditional non-convex optimization algorithm. Even when the number of antennas and the number of reflecting elements are increased, the DDPG algorithm still converges in a short time. This indicates that the DDPG algorithm effectively improves computational efficiency and is better suited to communication services with strict real-time requirements.
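To make the optimization problem in the abstract concrete, the following is a minimal sketch of how the three constraints and the rate objective could be evaluated for one candidate action of a DDPG agent. It is not the authors' implementation: the single information receiver, single energy receiver, linear energy-harvesting model, penalty-based reward, and every name and default value (`effective_channel`, `project_power`, `reward`, `eta`, `penalty`, `sigma2`) are assumptions made for illustration, and the actor and critic networks are omitted.

```python
import numpy as np

def effective_channel(h_d, h_r, G, theta):
    """Direct link plus cascaded BS -> IRS -> receiver link.

    h_d: (M,) direct channel, h_r: (N,) IRS-to-receiver channel,
    G: (N, M) BS-to-IRS channel, theta: (N,) phase shifts in radians.
    """
    phi = np.exp(1j * theta)       # unit-modulus reflection coefficients
    return h_d + (h_r * phi) @ G   # shape (M,)

def project_power(w, p_max):
    """Scale w so that ||w||^2 <= p_max (BS transmit power constraint)."""
    power = np.vdot(w, w).real
    return w if power <= p_max else w * np.sqrt(p_max / power)

def reward(w, theta, ch_ir, ch_er, p_max, e_min,
           sigma2=1.0, eta=0.8, penalty=10.0):
    """Rate minus a penalty for violating the minimum-energy constraint.

    ch_ir / ch_er are (h_d, h_r, G) tuples for the information and
    energy receivers. The penalty form is an assumed reward shaping,
    not something stated in the source.
    """
    w = project_power(w, p_max)
    h_ir = effective_channel(*ch_ir, theta)
    h_er = effective_channel(*ch_er, theta)
    rate = np.log2(1.0 + abs(h_ir @ w) ** 2 / sigma2)   # bit/s/Hz
    harvested = eta * abs(h_er @ w) ** 2                # linear EH model
    shortfall = max(0.0, e_min - harvested)
    return rate - penalty * shortfall, rate, harvested

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    M, N = 4, 16  # BS antennas, IRS elements (illustrative sizes)

    def cn(*shape):
        """Draw i.i.d. complex Gaussian (Rayleigh fading) entries."""
        return (rng.standard_normal(shape)
                + 1j * rng.standard_normal(shape)) / np.sqrt(2)

    G = cn(N, M)
    ch_ir = (cn(M), cn(N), G)
    ch_er = (cn(M), cn(N), G)
    w = cn(M)                              # stand-in for the actor output
    theta = rng.uniform(0, 2 * np.pi, N)   # stand-in for the actor output
    print(reward(w, theta, ch_ir, ch_er, p_max=1.0, e_min=0.1))
```

Parameterizing the IRS by phase angles keeps the unit-modulus constraint satisfied by construction, and rescaling the beamformer enforces the power budget, so only the energy constraint needs a penalty term in this sketch.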

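Keywords: multiple-input single-output; simultaneous wireless information and power transfer; intelligent reflecting surface; beamforming; deep deterministic policy gradient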

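Classification: TN929.5 [Electronics and Telecommunications: Communication and Information Systems]; TP18 [Electronics and Telecommunications: Information and Communication Engineering]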

 
