Tracking control of intelligent ship based on deep reinforcement learning  (cited by: 26)


Authors: ZHU Kang; HUANG Zhen [1]; WANG Xuming [2] (School of Automation, Wuhan University of Technology, Wuhan 430070, China; Intelligent Transport System Research Center, Wuhan University of Technology, Wuhan 430063, China)

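Affiliations: [1] School of Automation, Wuhan University of Technology, Wuhan 430070, Hubei, China; [2] Intelligent Transport System Research Center, Wuhan University of Technology, Wuhan 430063, Hubei, China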

Source: Chinese Journal of Ship Research, 2021, No. 1, pp. 105-113 (9 pages)

Funding: National Key R&D Program of China (2018YFB1601500).

Abstract: [Objectives] Tracking control of intelligent ships often has to cope with a complex control environment, low controller stability, and a heavy computational load. To achieve precise tracking control, this paper proposes a heading controller based on deep reinforcement learning (DRL). [Methods] First, combined with line-of-sight (LOS) guidance and based on the maneuvering characteristics and control requirements of the ship, the tracking problem is modeled as a Markov decision process, and its state space, action space, and reward function are designed. Then, the deep deterministic policy gradient (DDPG) algorithm is used to implement the controller, which is trained with an off-line learning method. Finally, the trained controller is compared with a BP-PID controller to analyze the control effects. [Results] Simulation results show that the designed DRL controller converges quickly during training and meets the control requirements. Compared with the BP-PID controller, the trained network tracks the path faster, with smaller yaw error and less frequent changes of the rudder angle. [Conclusions] The results can provide a reference for the tracking control of intelligent ships.
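
Illustrative note: the abstract names LOS guidance and a reward function but gives neither in detail. The Python sketch below shows one common lookahead-based LOS law for a straight path segment, together with a purely hypothetical reward that penalizes heading error and rudder activity. The function names, the angle convention, the lookahead distance, and the reward weights are all assumptions made for illustration; they are not taken from the paper and may differ from the authors' design.

```python
import math


def wrap_angle(angle):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


def los_desired_heading(pos, wp_prev, wp_next, lookahead):
    """Lookahead-based LOS guidance for one straight path segment.

    Returns (desired_heading, cross_track_error). Angles are measured
    from the x-axis, counter-clockwise positive (an assumed convention).
    """
    path_angle = math.atan2(wp_next[1] - wp_prev[1], wp_next[0] - wp_prev[0])
    dx = pos[0] - wp_prev[0]
    dy = pos[1] - wp_prev[1]
    # Signed lateral offset from the path (positive to the left of travel).
    cross_track = -dx * math.sin(path_angle) + dy * math.cos(path_angle)
    # Steer toward a point `lookahead` metres ahead on the path.
    desired = wrap_angle(path_angle + math.atan2(-cross_track, lookahead))
    return desired, cross_track


def reward(heading_error, rudder_change, w_heading=1.0, w_rudder=0.1):
    """Hypothetical reward: penalize heading error and rudder activity."""
    return -(w_heading * heading_error ** 2 + w_rudder * rudder_change ** 2)


if __name__ == "__main__":
    # Ship 5 m to the left of a path running along the x-axis.
    heading, error = los_desired_heading((10.0, 5.0), (0.0, 0.0), (100.0, 0.0), 5.0)
    print(math.degrees(heading), error)
```

In a setup like the one the abstract describes, the desired heading from the guidance law would feed the heading controller, and a DDPG agent would output the rudder command. The agent itself is omitted here because its network structure and state definition are not stated in this record.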

Keywords: intelligent ship; tracking control; deep reinforcement learning; line-of-sight (LOS) guidance

Classification: U664.82 [Transportation Engineering: Ship and Waterway Engineering]

 
