A reinforcement learning adaptive modulation scheme for non-stationary shallow-sea channels

Authors: QIU Yifan, ZHANG Xiaokang[1,2], CHEN Dongsheng[1,2,3], TONG Feng

Affiliations: [1] Key Laboratory of Underwater Acoustic Communication and Marine Information Technology of the Ministry of Education, Xiamen University, Xiamen 361005, Fujian, China; [2] College of Ocean and Earth Sciences, Xiamen University, Xiamen 361102, Fujian, China; [3] Shenzhen Research Institute of Xiamen University, Shenzhen 518000, Guangdong, China

Source: Journal of Xiamen University (Natural Science), 2022, No. 6, pp. 1072-1081 (10 pages)

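Funding: National Natural Science Foundation of China (11274259); Natural Science Foundation of Fujian Province (2018J05071); Shenzhen Virtual University Park support funding, R&D institution construction project (YFJGJS1.0).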

Abstract: Shallow-water acoustic channels vary randomly in time, space, and frequency, so an underwater acoustic communication system that uses a single modulation scheme struggles to balance link stability against data rate and cannot meet the demand for reliable, efficient transmission of marine information. Adaptive modulation has therefore become an important means of improving the environmental adaptability of underwater acoustic communication. However, because propagation delay in underwater acoustic channels is long, conventional adaptive modulation schemes based on threshold decisions and feedback suffer from outdated feedback, which degrades system performance. This paper introduces the repeated update Q-learning (RUQL) algorithm from reinforcement learning into adaptive modulation for shallow-sea channels. The signal-to-noise ratio and the Doppler frequency offset are used to characterize changes in the channel state. The scheme learns the channel variation by interacting with the environment through a Q-table and, over repeated iterations, learns the optimal policy, so that the system adjusts among multiple modulation schemes on its own. Experimental results show that, compared with the conventional adaptive modulation scheme that adjusts modulation parameters by threshold decisions, the proposed reinforcement-learning scheme clearly improves both system throughput and bit error rate, and that it converges faster than conventional Q-learning.
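The abstract does not give the update rule, so the following is only a minimal sketch of tabular RUQL applied to mode selection. It assumes the closed-form RUQL update as it is commonly stated in the reinforcement-learning literature, where the old estimate is weighted by (1 - alpha)^(1/pi(s, a)) and pi(s, a) is the probability of selecting action a in state s. The epsilon-greedy policy, the mode names, the bin counts, the reward rule, and the i.i.d. channel model are all hypothetical placeholders, not the paper's experimental setup.

```python
import random

MODES = ["mode_0", "mode_1", "mode_2"]  # hypothetical modulation modes
BITS = [1, 2, 3]                        # hypothetical bits per symbol of each mode


def epsilon_greedy_probs(q_row, epsilon):
    """Selection probability of every action under epsilon-greedy."""
    n = len(q_row)
    best = max(range(n), key=lambda a: q_row[a])
    probs = [epsilon / n] * n
    probs[best] += 1.0 - epsilon
    return probs


def ruql_update(q, state, action, reward, next_state, alpha, gamma, epsilon):
    """One RUQL step; assumes the chosen action has non-zero probability."""
    pi_sa = epsilon_greedy_probs(q[state], epsilon)[action]
    keep = (1.0 - alpha) ** (1.0 / pi_sa)  # weight kept on the old estimate
    target = reward + gamma * max(q[next_state])
    q[state][action] = keep * q[state][action] + (1.0 - keep) * target
    return q[state][action]


def toy_reward(state, action):
    """Hypothetical throughput: mode k decodes only if snr_bin - doppler_bin >= k."""
    snr_bin, doppler_bin = state
    return float(BITS[action]) if snr_bin - doppler_bin >= action else 0.0


def train(steps=20000, alpha=0.1, gamma=0.5, epsilon=0.2, seed=0):
    """Learn a Q-table over (SNR bin, Doppler bin) states on a toy channel."""
    rng = random.Random(seed)
    states = [(s, d) for s in range(3) for d in range(2)]
    q = {st: [0.0] * len(MODES) for st in states}
    state = rng.choice(states)
    for _ in range(steps):
        probs = epsilon_greedy_probs(q[state], epsilon)
        action = rng.choices(range(len(MODES)), weights=probs)[0]
        reward = toy_reward(state, action)
        next_state = rng.choice(states)  # toy i.i.d. channel, not a sea-trial model
        ruql_update(q, state, action, reward, next_state, alpha, gamma, epsilon)
        state = next_state
    return q
```

When the chosen action has selection probability 1, the step reduces to the ordinary Q-learning update with learning rate alpha. Rarely selected actions receive a larger effective step, which is the mechanism usually credited for RUQL's faster convergence than plain Q-learning.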

Keywords: repeated update Q-learning; adaptive modulation; shallow-sea channel; non-stationary

Classification code: TN929.3 [Electronics and Telecommunications: Communication and Information Systems]

 
