Design of a formation control algorithm for multiple autonomous underwater vehicles based on deep reinforcement learning  Cited by: 9


Authors: YAN Jing (闫敬)[1]; XU Long (徐龙); CAO Wen-qiang (曹文强); YANG Xian (杨睍); LUO Xiao-yuan (罗小元)[1]

Affiliation: [1] School of Electrical Engineering, Yanshan University, Qinhuangdao, Hebei 066004, China

Source: Control and Decision (《控制与决策》), 2023, No. 5, pp. 1457-1463 (7 pages)

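Funding: National Natural Science Foundation of China (62222314, 61973263, 61873345, 62033011); Natural Science Foundation of Hebei Province (2022203001, BJ2020031); Hebei Province Central Government Guided Local Fund project (226Z3201G).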

Abstract: This paper considers unknown underwater channels and uncertain model parameters, and proposes a formation control algorithm for multiple autonomous underwater vehicles (AUVs) based on deep reinforcement learning. First, a least-squares estimator built on environmental sampling data is developed to predict the unknown channel in a fading environment. Second, using the signal-to-noise ratio (SNR) obtained from the channel prediction estimator, a joint optimization problem of communication effectiveness and formation stability is formulated, and a formation control algorithm based on the deep deterministic policy gradient (DDPG) algorithm is given. Finally, simulation and experimental results verify the effectiveness of the proposed algorithm. In the simulations, compared with direct formation control, the proposed algorithm improves communication performance by 13.5% when communication effectiveness is taken into account.
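The abstract does not give the form of the least-squares channel estimator, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes a log-distance model, SNR(dB) ≈ a - 10·n·log10(d), and fits the two parameters from sampled (distance, SNR) pairs by ordinary least squares. The model, the function names (`fit_channel`, `predict_snr`, `simulate_samples`), and all numeric values are hypothetical illustrations, not the authors' method or data.

```python
import math
import random


def fit_channel(distances_m, snr_db):
    """Ordinary least-squares fit of snr_db ~ a - 10*n*log10(d).

    Assumed model (not taken from the paper): a is the SNR at 1 m in dB,
    n is the path-loss exponent. Returns the pair (a, n).
    """
    # Regressor x = -10*log10(d), so the model is the line y = a + n*x.
    x = [-10.0 * math.log10(d) for d in distances_m]
    m = len(x)
    mean_x = sum(x) / m
    mean_y = sum(snr_db) / m
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, snr_db))
    n = sxy / sxx              # slope: path-loss exponent
    a = mean_y - n * mean_x    # intercept: SNR at 1 m
    return a, n


def predict_snr(a, n, distance_m):
    """Predict the SNR in dB at a given distance from the fitted parameters."""
    return a - 10.0 * n * math.log10(distance_m)


def simulate_samples(a_true, n_true, count, noise_std_db, seed=0):
    """Generate synthetic samples; Gaussian noise in dB stands in for fading."""
    rng = random.Random(seed)
    distances = [rng.uniform(5.0, 200.0) for _ in range(count)]
    snr = [
        a_true - 10.0 * n_true * math.log10(d) + rng.gauss(0.0, noise_std_db)
        for d in distances
    ]
    return distances, snr


if __name__ == "__main__":
    # Hypothetical ground truth, used only to exercise the estimator.
    distances, snr = simulate_samples(60.0, 1.8, 500, 2.0)
    a_hat, n_hat = fit_channel(distances, snr)
    print("estimated a, n:", a_hat, n_hat)
    print("predicted SNR at 50 m (dB):", predict_snr(a_hat, n_hat, 50.0))
```

In a formation controller of the kind the abstract describes, a predicted SNR like this could feed the communication term of the joint objective. For models with more than two parameters, a general solver such as `numpy.linalg.lstsq` would be a natural replacement for the closed-form fit.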

Keywords: channel prediction; underwater vehicle; formation control; deep reinforcement learning; joint optimization

Classification number: TP273 [Automation and Computer Technology: Detection Technology and Automatic Devices]

 
