Authors: LIU Qiaoshou [1,2,3]; ZHOU Xiong; LIU Shuang; DENG Yifeng
Affiliations: [1] School of Communications and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China; [2] Advanced Network and Intelligent Connection Technology Key Laboratory of Chongqing Education Commission, Chongqing 400065, China; [3] Chongqing Key Laboratory of Ubiquitous Sensing and Networking, Chongqing 400065, China
Source: Journal on Communications (《通信学报》), 2023, No. 9, pp. 104-114 (11 pages)
Funding: National Natural Science Foundation of China (No. 61901075); Science and Technology Foundation of Chongqing Education Commission (No. KJZDK202200604)
Abstract: For orthogonal frequency division multiplexing (OFDM) systems, an adaptive pilot design algorithm based on deep reinforcement learning was proposed. The pilot design problem was formulated as a Markov decision process, where the indices of the pilot positions were defined as actions and the reward function was defined by a mean squared error (MSE) reduction strategy; deep reinforcement learning was then used to update the pilot positions. Pilots are allocated adaptively and dynamically according to channel conditions, so that channel characteristics are exploited to combat channel fading. Simulation results show that the proposed algorithm significantly improves channel estimation performance over the traditional uniform pilot allocation scheme under three typical 3GPP multipath channels.
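The core idea in the abstract (pilot placement as a sequential decision problem, with the reward defined as the reduction in channel-estimation MSE) can be sketched in miniature. The paper uses deep reinforcement learning; the stand-in below uses plain tabular Q-learning instead, and the channel model, SNR, subcarrier count, and pilot budget are all illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

N, TAPS, N_PILOTS = 32, 4, 4          # subcarriers, channel taps, pilots per symbol (assumed)

def channel():
    """Random frequency-selective channel: FFT of a few complex taps."""
    h = (rng.standard_normal(TAPS) + 1j * rng.standard_normal(TAPS)) / np.sqrt(2 * TAPS)
    return np.fft.fft(h, N)

def est_mse(pilots, H, snr_db=10.0):
    """MSE of a channel estimate: noisy LS at the pilot tones + linear interpolation."""
    p = np.sort(np.asarray(pilots))
    scale = 10 ** (-snr_db / 20) / np.sqrt(2)
    noise = scale * (rng.standard_normal(p.size) + 1j * rng.standard_normal(p.size))
    ls = H[p] + noise                  # least-squares estimate at pilot positions
    k = np.arange(N)
    est = np.interp(k, p, ls.real) + 1j * np.interp(k, p, ls.imag)
    return np.mean(np.abs(est - H) ** 2)

# Tabular Q-learning: each episode places N_PILOTS pilots one by one (the action
# is a subcarrier index); the per-step reward is the drop in estimation MSE.
Q = np.zeros(N)
eps, alpha = 0.2, 0.1
for episode in range(2000):
    H = channel()
    chosen = []
    prev = est_mse([0, N - 1], H)      # edge pilots seeded so interpolation spans the band
    avail = list(range(1, N - 1))
    for _ in range(N_PILOTS):
        if rng.random() < eps:         # epsilon-greedy exploration
            a = int(rng.choice(avail))
        else:
            a = max(avail, key=lambda k: Q[k])
        avail.remove(a)
        chosen.append(a)
        mse = est_mse([0, N - 1] + chosen, H)
        Q[a] += alpha * ((prev - mse) - Q[a])   # reward = MSE reduction
        prev = mse

# Learned placement: edges plus the interior tones with the highest action values.
inner = (1 + np.argsort(Q[1:N - 1])[-N_PILOTS:]).tolist()
best = sorted([0, N - 1] + inner)
```

This is only a toy: the paper's deep RL agent conditions on channel state, whereas this table averages over channel realizations, so it learns an on-average-good placement rather than a per-channel adaptive one.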
Keywords: orthogonal frequency division multiplexing; deep reinforcement learning; Markov decision process; multipath channel
Classification: TN92 [Electronics and Telecommunications: Communication and Information Systems]