Path Planning for USV Based on Reinforcement Learning with Multi-Task Constraints

Cited by: 7

Authors: FENG Jia-xiang (封佳祥); JIANG Kun-yi (江坤颐); ZHOU Bin (周彬) [1]; YUAN Zhi-hao (袁志豪) (Science and Technology on Underwater Vehicle Laboratory, Harbin Engineering University, Harbin 150001, China)

Affiliation: [1] Science and Technology on Underwater Vehicle Laboratory, Harbin Engineering University

Source: Ship Science and Technology (《舰船科学技术》), 2019, No. 23, pp. 140-146 (7 pages)

Abstract: This paper presents a reinforcement-learning-based path planning algorithm for unmanned surface vehicles (USVs) under multi-task constraints. Grey prediction is used to generate region proposals, which improves both the speed and the accuracy with which a neural network detects surface targets in consecutive video frames and, in turn, the accuracy of the environment model used for path planning. The USV path is then planned under multi-task constraints through online training with the Q-learning algorithm. Because Q-learning converges slowly under multi-task constraints, a Q-learning algorithm based on a task-decomposition reward function is proposed. Simulation experiments verify the feasibility of reinforcement-learning path planning under multi-task constraints, and physical experiments verify that the algorithm meets practical requirements.
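The abstract does not state the form of the reward function, the state encoding, or any hyperparameters, so the following is only a minimal sketch of the general idea rather than the authors' implementation. It assumes a hypothetical 5x5 grid, three ordered task waypoints, and a tabular Q-learning agent whose state is extended with the number of completed tasks; the reward is decomposed so that each sub-task pays out on completion instead of a single reward at the end of the mission. All names and numeric values (`TASKS`, `OBSTACLES`, `step`, `train`, `greedy_path`, the reward magnitudes) are illustrative assumptions.

```python
import random

# Hypothetical 5x5 grid: the USV starts at START and must reach the task
# waypoints in TASKS in order while avoiding OBSTACLES (all values assumed).
SIZE = 5
START = (0, 0)
TASKS = [(0, 4), (4, 4), (4, 0)]
OBSTACLES = {(2, 2), (1, 3), (3, 1)}
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # north, south, west, east


def step(state, action):
    """Apply one move; return (next_state, reward, done)."""
    row, col, stage = state  # stage = number of tasks already completed
    nr, nc = row + action[0], col + action[1]
    # Leaving the grid or hitting an obstacle: stay in place, pay a penalty.
    if not (0 <= nr < SIZE and 0 <= nc < SIZE) or (nr, nc) in OBSTACLES:
        return state, -5.0, False
    reward = -1.0  # per-step cost favours short paths
    if (nr, nc) == TASKS[stage]:
        # Task-decomposed reward: each sub-task pays out when completed,
        # instead of one reward at the end of the whole mission.
        stage += 1
        reward += 20.0
    return (nr, nc, stage), reward, stage == len(TASKS)


def train(episodes=3000, alpha=0.5, gamma=0.95, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {}  # state -> list of action values
    for _ in range(episodes):
        state = (*START, 0)
        for _ in range(200):  # cap on episode length
            values = q.setdefault(state, [0.0] * len(ACTIONS))
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=values.__getitem__)
            nxt, reward, done = step(state, ACTIONS[a])
            # Standard Q-learning target: bootstrap from the best next action.
            target = reward
            if not done:
                target += gamma * max(q.setdefault(nxt, [0.0] * len(ACTIONS)))
            values[a] += alpha * (target - values[a])
            state = nxt
            if done:
                break
    return q


def greedy_path(q, max_steps=50):
    """Follow the learned greedy policy from START and return visited cells."""
    state = (*START, 0)
    path = [state[:2]]
    for _ in range(max_steps):
        values = q.get(state, [0.0] * len(ACTIONS))
        a = max(range(len(ACTIONS)), key=values.__getitem__)
        state, _, done = step(state, ACTIONS[a])
        path.append(state[:2])
        if done:
            break
    return path


if __name__ == "__main__":
    print(greedy_path(train()))
```

Including the task stage in the state lets one Q-table represent a different policy for each sub-task, and the intermediate rewards give the agent a learning signal before the full mission is ever completed. This is one plausible reading of why a decomposed reward could speed up convergence; the paper's actual formulation may differ.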

Keywords: unmanned surface vehicle (USV); path planning; reinforcement learning; object detection

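CLC number: U664 [Transportation Engineering: Ship and Waterway Engineering]; TP39 [Transportation Engineering: Naval Architecture and Ocean Engineering]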

 
