Application of meta-reinforcement learning in AUV multi-task rapid adaptive control

Authors: XU Chunhui (徐春晖) [1,2]; YANG Sili (杨士霖); XU Desheng (徐德胜); FANG Tian (方田)

Affiliations: [1] State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, Liaoning, China; [2] Key Laboratory of Marine Robotics, Liaoning Province, Shenyang 110169, Liaoning, China; [3] University of Chinese Academy of Sciences, Beijing 100049, China

Source: Ship Science and Technology (《舰船科学技术》), 2025, No. 5, pp. 89-96 (8 pages)

Funding: National Key Research and Development Program of China (2022YFC2806000).

Abstract: AUV tracking controllers based on deep reinforcement learning must be trained from scratch for each new task, train slowly, and are unstable. To address these problems, a multi-task rapid adaptive control algorithm based on meta-reinforcement learning, termed R-SAC (Reptile-Soft Actor Critic), is designed. R-SAC combines meta-learning with reinforcement learning and models the tracking task using the kinematic and dynamic equations of the underwater vehicle. During the training phase, R-SAC obtains a set of optimal initial model parameters for the AUV tracking controller, so that training started from these parameters converges quickly on different tasks and the controller adapts rapidly to each of them. Simulation results show that, compared with a randomly initialized reinforcement learning controller, the proposed method improves convergence speed by at least 1.6 times while keeping the tracking error within 2.8%.

Keywords: AUV; meta-reinforcement learning; optimal initial model parameters; fast convergence

Classification code: TP242.6 [Automation and Computer Technology: Detection Technology and Automatic Equipment]
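
The abstract describes R-SAC as learning, via Reptile, a set of initial parameters from which training converges quickly on each new task. As a rough illustration of that idea only, the sketch below applies the standard Reptile outer update, theta <- theta + eps * (phi - theta), to a toy family of tasks. Everything specific here is an assumption made for illustration and does not come from the paper: the inner loop is plain gradient descent on a quadratic loss standing in for SAC updates, and the function names, task targets, and hyperparameters are hypothetical.

```python
import numpy as np


def inner_adapt(theta, target, steps=5, lr=0.1):
    """Adapt to one task starting from theta.

    Hypothetical stand-in for a few SAC updates: gradient descent on the
    toy loss 0.5 * ||phi - target||^2, whose gradient is (phi - target).
    """
    phi = theta.copy()
    for _ in range(steps):
        phi -= lr * (phi - target)
    return phi


def reptile(task_targets, meta_iters=200, meta_lr=0.5, seed=0):
    """Reptile outer loop over a set of tasks.

    Each iteration samples one task, adapts to it from the current
    initialization, then moves the initialization part of the way
    toward the adapted parameters.
    """
    rng = np.random.default_rng(seed)
    theta = rng.normal(size=task_targets.shape[1])  # random starting point
    for _ in range(meta_iters):
        target = task_targets[rng.integers(len(task_targets))]
        phi = inner_adapt(theta, target)
        theta += meta_lr * (phi - theta)  # Reptile meta-update
    return theta


def steps_to_converge(theta, target, tol=1e-2, lr=0.1, max_steps=1000):
    """Count gradient steps until the parameters are within tol of target."""
    phi = theta.copy()
    for step in range(max_steps):
        if np.linalg.norm(phi - target) < tol:
            return step
        phi -= lr * (phi - target)
    return max_steps


# Toy task family (assumed): each task is "reach this parameter vector".
train_targets = np.array([[3.0, -2.0], [3.5, -1.5], [2.5, -2.5], [3.0, -1.0]])
new_target = np.array([3.2, -1.8])  # unseen task from the same family

meta_init = reptile(train_targets)
baseline_init = np.zeros(2)  # initialization that has seen no tasks

meta_steps = steps_to_converge(meta_init, new_target)
baseline_steps = steps_to_converge(baseline_init, new_target)
print("steps from meta-learned init:", meta_steps)
print("steps from baseline init:", baseline_steps)
```

In this toy setting the meta-learned initialization ends up among the training tasks, so adaptation to a nearby unseen task needs fewer steps than from the baseline. It shows the mechanism only and says nothing about the 1.6-times figure reported in the abstract.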
