Autonomous navigation policy optimization algorithm for mobile robots based on trajectory guidance (基于轨迹引导的移动机器人导航策略优化算法)

Authors: Li Zhongwei [1], Liu Weipeng, Luo Cai

Affiliation: [1] College of Oceanography & Space Informatics, China University of Petroleum (East China), Qingdao, Shandong 266580, China

Source: Application Research of Computers (《计算机应用研究》), 2024, No. 5, pp. 1456-1461 (6 pages)

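Funding: National Natural Science Foundation of China, General Program (62071491); University Independent Innovation Research Program (Science and Engineering), Strategic Special Project (22CX01004A-1).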

Abstract: Mobile robots that use deep reinforcement learning for autonomous navigation in cluttered, obstacle-dense environments face exploration difficulties, which in turn lead to low learning efficiency. To address this problem, this paper proposes the trajectory-guided navigation policy optimization (TGNPO) algorithm. First, imitation learning is used to train an expert policy for the mobile robot that provides both expert demonstration behavior and navigation trajectory prediction, with the aim of comprehensively guiding deep reinforcement learning training. Second, the navigation trajectory predicted by the expert policy is fused with the real-time image perceived by the robot at the current moment, and a coordinate attention mechanism is used to extract the feature regions that guide the robot's future navigation, improving the learning performance of the navigation model. Finally, the navigation trajectory predicted by the expert policy is used to constrain the robot's policy trajectory, reducing ineffective exploration and erroneous decisions during navigation. The algorithm was deployed on both simulation and physical platforms. The experimental results show that, compared with existing state-of-the-art methods, it achieves significant advantages in navigation learning efficiency and trajectory smoothness, demonstrating that it can execute robot navigation tasks efficiently and safely.
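The abstract does not state how the expert trajectory constrains the policy trajectory. As a minimal sketch of the general idea only, and not the TGNPO formulation, the example below assumes both trajectories are equal-length lists of 2-D waypoints and penalizes their mean Euclidean distance in the reward. The function names, the `weight` parameter, and the penalty form are all hypothetical.

```python
import math


def trajectory_deviation(policy_traj, expert_traj):
    """Mean Euclidean distance between matching waypoints of two 2-D trajectories."""
    if len(policy_traj) != len(expert_traj) or not policy_traj:
        raise ValueError("trajectories must be non-empty and of equal length")
    total = sum(math.dist(p, e) for p, e in zip(policy_traj, expert_traj))
    return total / len(policy_traj)


def constrained_reward(env_reward, policy_traj, expert_traj, weight=0.5):
    """Environment reward minus a weighted deviation penalty.

    Hypothetical shaping term: the source does not give the actual constraint.
    """
    return env_reward - weight * trajectory_deviation(policy_traj, expert_traj)


# Illustrative waypoints (made up): the policy drifts away from the expert path.
expert = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
policy = [(0.0, 0.0), (1.0, 0.5), (2.0, 1.0)]

print(trajectory_deviation(policy, expert))
print(constrained_reward(1.0, policy, expert))
```

A larger `weight` keeps the policy closer to the expert's predicted path at the cost of less exploration; the value used here is arbitrary.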

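Keywords: autonomous navigation of mobile robots; trajectory prediction; trajectory-image fusion; trajectory constraint; deep reinforcement learning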

Classification number: TP242.6 [Automation and Computer Technology: Detection Technology and Automatic Devices]

 
