A deep deterministic policy gradient method for collision avoidance of autonomous ship

Cited by: 1

Authors: HU Zhengyang; WANG Yong [1] (Jiangsu Automation Research Institute, Lianyungang 222061, China)

Affiliation: [1] Jiangsu Automation Research Institute, Lianyungang 222061, Jiangsu, China

Source: Command Control & Simulation (《指挥控制与仿真》), 2024, No. 5, pp. 37-44 (8 pages)

Abstract: This paper addresses collision avoidance decision-making for autonomous ships under different encounter situations. Building on the Deep Deterministic Policy Gradient (DDPG) algorithm, reward functions are designed with the International Regulations for Preventing Collisions at Sea (COLREGS) as the baseline, and potential-based reward shaping is introduced to guide the agent toward the optimal policy, so that the agent avoids obstacles and reaches its navigation target point while complying with the rules. Simulations validate the method in two-ship and multi-ship scenarios under different encounter situations, and the results are compared with the TD3 algorithm. The results show that the proposed algorithm converges quickly and trains stably; the resulting model avoids obstacles effectively while complying with COLREGS, and in two-ship encounters it plans shorter, more efficient paths than TD3.
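
The abstract mentions potential-based reward shaping but does not give the potential function used in the paper. The following is only a minimal sketch of the general technique, in which a term gamma * phi(s') - phi(s) is added to the base reward. The potential (negative Euclidean distance to the goal), the discount factor, and the names `potential` and `shaped_reward` are illustrative assumptions, not the authors' design.

```python
import math

GAMMA = 0.99  # assumed discount factor; not taken from the paper


def potential(position, goal):
    """Hypothetical potential: negative Euclidean distance to the goal."""
    return -math.dist(position, goal)


def shaped_reward(base_reward, position, next_position, goal, gamma=GAMMA):
    """Return base_reward plus the shaping term gamma * phi(s') - phi(s)."""
    shaping = gamma * potential(next_position, goal) - potential(position, goal)
    return base_reward + shaping
```

With this assumed potential, a step that brings the ship closer to the goal receives a positive shaping term and a step away from it receives a negative one. In a COLREGS-aware setup, the rule-compliance terms would presumably enter through `base_reward`.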

Keywords: unmanned ship; autonomous navigation; collision avoidance; deep reinforcement learning; COLREGS

Classification: E919 [Military]; U666.1 [Transportation Engineering: Ship and Waterway Engineering]
