Deep Reinforcement Learning Based Covert Communication for RIS Assisted NOMA Networks


Authors: ZHANG Wei; WANG Zhenjun; SUN Lu; CUI Yaru; NIE Chuhui; LI Yiming; YANG Gang[3,4]

Affiliations: [1] Technical Development Department, Wuhan Tianbaolai Information Technology Co., Ltd., Wuhan 430079, China; [2] School of Information Science and Engineering, Hunan Institute of Science and Technology, Yueyang 414006, China; [3] Shenzhen Institute for Advanced Study, UESTC, Shenzhen 518110, China; [4] National Key Laboratory of Wireless Communications, University of Electronic Science and Technology of China, Chengdu 611731, China

Source: Radio Engineering (《无线电工程》), 2025, No. 4, pp. 749-756 (8 pages)

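Funding: Natural Science Foundation of Hunan Province (2024JJ7218, 2025JJ70287); Shenzhen Science and Technology Program (JCYJ20220530164814032); Project of the Education Department of Hunan Province (23C0217).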

Abstract: Conventional convex optimization methods adapt poorly and incur high complexity in covert communication over Reconfigurable Intelligent Surface (RIS)-assisted Non-Orthogonal Multiple Access (NOMA) networks. To address this, a general-purpose optimization algorithm based on Deep Reinforcement Learning (DRL) is proposed, aiming to overcome the performance bottleneck of existing methods in dynamic environments and to provide an efficient solution for complex communication scenarios. Specifically, a Twin Delayed Deep Deterministic policy gradient (TD3) algorithm that combines a truncated-action mechanism with a delayed-update strategy is presented; using an Actor-Critic architecture, it jointly optimizes the base station power allocation and the RIS phase shifts to maximize the covert rate. Simulation results show that the proposed TD3 scheme effectively suppresses overestimation errors and improves the stability of the Actor-Critic architecture. Under user Quality of Service (QoS) constraints, the joint optimization significantly improves the covert communication rate. Furthermore, the proposed scheme adapts dynamically to changes in the channel environment, meeting the core requirements of covert communication scenarios.

Keywords: covert communication; deep reinforcement learning; non-orthogonal multiple access; reconfigurable intelligent surface

CLC number: TN911 [Electronics and Telecommunications: Communication and Information Systems]
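
The abstract names three ingredients: clipped actions, delayed updates, and an Actor-Critic pair that outputs both the power allocation and the RIS phase shifts. The sketch below illustrates the generic TD3 versions of these steps in NumPy. It is not the authors' implementation: the function names, the hyperparameter defaults, and the mapping from an action vector in [-1, 1] to a power fraction and phase shifts are all assumptions made for illustration, and the paper's actual state, reward, and network design are not given on this page.

```python
import numpy as np

def smoothed_target_action(mu_next, rng, sigma=0.2, noise_clip=0.5):
    """Add clipped Gaussian noise to the target actor output, then clip to [-1, 1]."""
    noise = np.clip(rng.normal(0.0, sigma, size=mu_next.shape), -noise_clip, noise_clip)
    return np.clip(mu_next + noise, -1.0, 1.0)

def td3_target(reward, q1_next, q2_next, done, gamma=0.99):
    """Critic target using the smaller of the two target-critic estimates."""
    return reward + gamma * (1.0 - done) * np.minimum(q1_next, q2_next)

def actor_update_due(step, policy_delay=2):
    """Delayed update: the actor is refreshed once every policy_delay critic steps."""
    return step % policy_delay == 0

def decode_action(action, n_ris):
    """Hypothetical decoding of an action vector in [-1, 1] (an assumption).

    action[0]        -> power-allocation fraction in [0, 1]
    action[1:1+n_ris] -> RIS phase shifts in [0, 2*pi]
    """
    alpha = 0.5 * (action[0] + 1.0)
    theta = np.pi * (action[1:1 + n_ris] + 1.0)
    return alpha, np.exp(1j * theta)  # unit-modulus reflection coefficients

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a_next = smoothed_target_action(np.array([0.9, -0.9, 0.0]), rng)
    alpha, phi = decode_action(a_next, n_ris=2)
    y = td3_target(1.0, np.array([2.0]), np.array([3.0]), done=0.0)
    print(alpha, np.abs(phi), y, actor_update_due(4))
```

Taking the minimum of two critic estimates is the standard TD3 device against value overestimation, and decoding phases through `exp(1j * theta)` keeps every RIS coefficient at unit modulus by construction, so that constraint never has to be enforced separately.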

 
