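Authors: HUANG Zhe (黄哲); LIU An (刘安) [1]
Affiliation: [1] Zhejiang University, Hangzhou 310007, Zhejiang, China
Source: Mobile Communications (《移动通信》), 2024, No. 10, pp. 41-48 (8 pages)
Funding: National Natural Science Foundation of China, "Joint Compressed Channel Estimation and Localization-Tracking Methods Based on Deep Stochastic Optimization" (62071416).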
Abstract: In the near future, integrated sensing and communication (ISAC) systems are expected to provide communication and sensing services simultaneously. Such systems need advanced beamforming algorithms to guarantee quality of service while meeting diverse service targets and resource constraints. Beamforming design is usually formulated as an optimization problem. However, algorithms built on traditional optimization theory can handle only resource allocation problems with instantaneous constraints; they cannot handle problems with long-term constraints, which degrades system performance. One feasible remedy is to design the algorithm on the basis of reinforcement learning (RL). Existing work, however, mainly addresses unconstrained RL and pays little attention to constrained RL, which limits the use of RL in beamforming optimization. To overcome this challenge, an RL method based on constrained stochastic successive convex approximation (CSSCA) is proposed. The method replaces the original objective and constraint functions with corresponding convex approximation functions and, by solving a sequence of convex approximation problems, is guaranteed to converge to a Karush-Kuhn-Tucker (KKT) point of the original problem. Finally, simulation results demonstrate the superiority of the proposed method.
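The iteration described in the abstract can be sketched generically as follows. This is a minimal illustration of a successive convex approximation step under assumed notation, not the authors' exact formulation: the symbols $\theta$, $f_i$, $\hat f_i^{\,t}$, $g_i^t$, $\tau_i$ and $\gamma^t$ are all assumptions introduced here, and the quadratic surrogate is one common choice among several. In an RL setting, $f_0$ would plausibly be an expected long-term cost and $f_1, \dots, f_m$ the long-term constraint costs; handling of infeasible subproblems is omitted.

```latex
% Generic constrained problem (notation assumed, not taken from the paper):
\[
\min_{\theta} \; f_0(\theta)
\quad \text{s.t.} \quad f_i(\theta) \le 0, \qquad i = 1, \dots, m .
\]

% Convex surrogate of each f_i around the current iterate \theta^t.
% \hat f_i^t and g_i^t are estimates of f_i(\theta^t) and its gradient; \tau_i > 0.
\[
\bar f_i^{\,t}(\theta) = \hat f_i^{\,t} + (g_i^t)^{\top} (\theta - \theta^t)
  + \tau_i \, \lVert \theta - \theta^t \rVert_2^2 ,
\qquad i = 0, 1, \dots, m .
\]

% Convex subproblem solved at iteration t:
\[
\bar\theta^t = \arg\min_{\theta} \; \bar f_0^{\,t}(\theta)
\quad \text{s.t.} \quad \bar f_i^{\,t}(\theta) \le 0, \qquad i = 1, \dots, m .
\]

% Smoothed update with step size \gamma^t \in (0, 1]:
\[
\theta^{t+1} = (1 - \gamma^t)\,\theta^t + \gamma^t\,\bar\theta^t .
\]
```

Each surrogate is strongly convex because of the $\tau_i$ term, so every subproblem has a unique solution whenever it is feasible.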
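Keywords: integrated sensing and communication; beamforming optimization; deep reinforcement learning; constrained stochastic successive convex approximation
Classification: TN929.5 [Electronics and Telecommunications: Communication and Information Systems]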