Research on Robot Navigation Method Integrating Safe Convex Space and Deep Reinforcement Learning (安全凸空间与深度强化学习结合的机器人导航方法)

Authors: DONG Mingze; WEN Zhuanglei; CHEN Xiai; YANG Jiongkun; ZENG Tao

Affiliation: [1] College of Mechanical and Electrical Engineering, China Jiliang University, Hangzhou 310018, Zhejiang, China

Source: Acta Armamentarii (兵工学报), 2024, No. 12, pp. 4372-4382 (11 pages)

Funding: National Natural Science Foundation of China (52005472); Zhejiang Provincial Natural Science Foundation Exploration Project (LQ20E050015).

Abstract: A mobile robot navigation method based on deep reinforcement learning (DRL) is proposed for scenarios in which the global map is unknown and the environment contains both dynamic and static obstacles. Compared with other DRL-based navigation methods for complex dynamic environments, the proposed method improves the design of the action space, the state space, and the reward function. It also separates the control stage from the neural network, which helps transfer simulation results conveniently and effectively to practical applications on various robots. Specifically, the action space is defined as the intersection of the safe convex space, computed from 2D Lidar point cloud data, with the kinematic limits of the robot. This intersection narrows the sampling space of feasible trajectories while meeting both short-term dynamic obstacle avoidance and long-term global navigation needs. Reference points are sampled from this action space to form a reference path, which the robot tracks using a model predictive control (MPC) algorithm. The state space and reward function additionally incorporate elements such as the safe convex space and long- and short-term reference points. Ablation experiments show that this design achieves a higher navigation success rate and shorter travel time in various static and dynamic environments, and generalizes well.
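The action-space construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions that are not in the record: the safe convex space is given as a counter-clockwise polygon in the robot frame, the kinematic limit is approximated by a disc of radius `max_reach` around the robot, and uniform rejection sampling stands in for the learned DRL policy that would actually choose the reference point. All function names are hypothetical, and MPC tracking of the resulting path is left out.

```python
import math
import random


def halfplanes_from_polygon(vertices):
    """Convert a counter-clockwise convex polygon into half-planes.

    Each half-plane is a tuple (a, b, c) such that points inside the
    polygon satisfy a*x + b*y <= c.
    """
    planes = []
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # For a counter-clockwise edge, the outward normal is (dy, -dx).
        a, b = y2 - y1, -(x2 - x1)
        planes.append((a, b, a * x1 + b * y1))
    return planes


def in_action_space(point, planes, max_reach):
    """True if the point lies in the convex region AND within reach.

    Assumption: the kinematic limit is a disc of radius max_reach
    centred on the robot, which sits at the origin of this frame.
    """
    x, y = point
    inside_convex = all(a * x + b * y <= c + 1e-9 for a, b, c in planes)
    within_reach = math.hypot(x, y) <= max_reach
    return inside_convex and within_reach


def sample_reference_point(planes, max_reach, rng, max_tries=1000):
    """Rejection-sample one reference point from the action space.

    Uniform sampling is a placeholder for the policy network's output.
    Returns None if no valid point is found within max_tries.
    """
    for _ in range(max_tries):
        # sqrt makes the samples uniform over the disc area.
        r = max_reach * math.sqrt(rng.random())
        theta = rng.uniform(-math.pi, math.pi)
        candidate = (r * math.cos(theta), r * math.sin(theta))
        if in_action_space(candidate, planes, max_reach):
            return candidate
    return None


if __name__ == "__main__":
    # Hypothetical safe convex space: a 2 m x 2 m square around the robot.
    square = [(-1.0, -1.0), (1.0, -1.0), (1.0, 1.0), (-1.0, 1.0)]
    planes = halfplanes_from_polygon(square)
    print(sample_reference_point(planes, max_reach=1.2, rng=random.Random(0)))
```

Representing the convex region as half-planes keeps the membership test to a handful of linear inequalities, so intersecting it with the reach constraint is just a logical AND of two cheap checks.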

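Keywords: mobile robot navigation; deep reinforcement learning; safe convex space; model predictive control; dynamic unknown environment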

Classification code: TP242.6 [Automation and Computer Technology: Detection Technology and Automatic Devices]
