Adaptive pilot design for OFDM based on deep reinforcement learning  Cited by: 2


Authors: LIU Qiaoshou [1,2,3], ZHOU Xiong, LIU Shuang, DENG Yifeng

Affiliations: [1] School of Communications and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China; [2] Advanced Network and Intelligent Connection Technology Key Laboratory of Chongqing Education Commission of China, Chongqing 400065, China; [3] Chongqing Key Laboratory of Ubiquitous Sensing and Networking, Chongqing 400065, China

Source: Journal on Communications (《通信学报》), 2023, No. 9, pp. 104-114 (11 pages)

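Funding: National Natural Science Foundation of China (No. 61901075); Science and Technology Research Program of Chongqing Municipal Education Commission (No. KJZDK202200604).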

Abstract: For orthogonal frequency division multiplexing (OFDM) systems, an adaptive pilot design algorithm based on deep reinforcement learning is proposed. The pilot design problem is formulated as a Markov decision process in which the indices of the pilot positions are the actions and the reward function is defined by a strategy based on reducing the mean squared error (MSE); deep reinforcement learning is then used to update the pilot positions. Pilots are allocated adaptively and dynamically according to channel conditions, so that channel characteristics are exploited to combat channel fading. Simulation results show that, under three typical 3GPP multipath channels, the proposed algorithm significantly improves channel estimation performance over the conventional uniform pilot allocation scheme.
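To make the Markov decision process in the abstract concrete, the following is a minimal Python sketch of a toy environment, not the authors' implementation. The abstract does not give the state definition, the exact reward formula, the network architecture, or the channel estimator, so everything here is an assumption made for illustration: the state is the current set of pilot indices, an action is a flat integer encoding "move pilot slot s to subcarrier k", the reward is the previous MSE minus the new MSE, and the estimator is least squares at the pilots followed by linear interpolation. The names `PilotEnv` and `ls_interp_mse` are hypothetical, and a random policy stands in for the deep reinforcement learning agent.

```python
import numpy as np


def ls_interp_mse(h_true, pilots, noise_std, rng):
    """MSE of an LS estimate at the pilots, linearly interpolated to all subcarriers.

    Assumes unit-power pilot symbols, so the LS estimate is the channel plus noise.
    """
    idx = np.sort(np.asarray(pilots))
    noise = noise_std * (rng.standard_normal(idx.size)
                         + 1j * rng.standard_normal(idx.size)) / np.sqrt(2)
    h_ls = h_true[idx] + noise
    k = np.arange(h_true.size)
    # Interpolate real and imaginary parts separately; edges are held constant.
    h_hat = np.interp(k, idx, h_ls.real) + 1j * np.interp(k, idx, h_ls.imag)
    return float(np.mean(np.abs(h_hat - h_true) ** 2))


class PilotEnv:
    """Toy MDP: state = pilot indices, action = (slot, new index), reward = MSE drop."""

    def __init__(self, h_true, n_pilots, noise_std=0.0, seed=0):
        self.h = h_true
        self.n = h_true.size
        self.noise_std = noise_std
        self.rng = np.random.default_rng(seed)
        # Start from the uniform allocation used as the baseline in the abstract.
        self.pilots = np.linspace(0, self.n - 1, n_pilots).round().astype(int)
        self.n_actions = n_pilots * self.n
        self.mse = ls_interp_mse(self.h, self.pilots, self.noise_std, self.rng)

    def step(self, action):
        """Apply a flat action index; return (new state, reward)."""
        slot, pos = divmod(int(action), self.n)
        new = self.pilots.copy()
        new[slot] = pos
        if len(set(new.tolist())) < new.size:
            # Two pilots on one subcarrier: keep the state, give zero reward.
            return self.pilots.copy(), 0.0
        new_mse = ls_interp_mse(self.h, new, self.noise_std, self.rng)
        reward = self.mse - new_mse  # assumed form of the "MSE reduction" reward
        self.pilots, self.mse = new, new_mse
        return self.pilots.copy(), reward


# Toy 4-tap multipath channel on 64 subcarriers (illustrative values only).
rng = np.random.default_rng(1)
taps = (rng.standard_normal(4) + 1j * rng.standard_normal(4)) / np.sqrt(8)
h_true = np.fft.fft(taps, n=64)

env = PilotEnv(h_true, n_pilots=8)
initial_mse = env.mse
rewards = []
for _ in range(200):
    action = int(rng.integers(env.n_actions))  # random placeholder policy
    _, r = env.step(action)
    rewards.append(r)
```

Because each reward is a difference of consecutive MSE values, the rewards accumulated over an episode sum to the total MSE change relative to the uniform starting allocation. A learned agent would replace the random action choice with a policy that maximizes this return.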

Keywords: orthogonal frequency division multiplexing; deep reinforcement learning; Markov decision process; multipath channel

Classification (CLC): TN92 [Electronics and Telecommunications: Communication and Information Systems]

 
