Authors: CHEN Renxiang, ZHU Yuhang, YANG Lixia, HE Jiale, TANG Yubin
Affiliations: [1] School of Mechatronics and Vehicle Engineering, Chongqing Jiaotong University, Chongqing 400074, China; [2] Business and Management College, Chongqing University of Science and Technology, Chongqing 401331, China
Source: Journal of Chinese Inertial Technology, 2024, No. 12, pp. 1250-1257, 1262 (9 pages)
Funding: National Natural Science Foundation of China (51975079); Chongqing Technology Innovation and Application Demonstration Project (cstc2018jscx-msybX0012); Science and Technology Research Program of Chongqing Municipal Education Commission (KJZD-M202200701); Open Fund of the Chongqing Engineering Laboratory for Transportation Engineering Application Robots (CELTEAR-KFKT-202002).
Abstract: To address the problem that paths planned for quadruped robots in mountain environments are difficult to traverse when slope is not taken into account, a mountain path planning method based on slope-potential-energy-guided reinforcement learning is proposed. First, the mountain model is partitioned according to a slope classification principle, and the black hole principle is introduced, in light of the terrain characteristics, to improve the artificial potential field (APF) method and construct a global slope potential field, reducing the complexity of the multi-dimensional environment. Second, the potential energy in the field is probability-weighted and fed into the reinforcement learning network to guide early training, accelerating the algorithm's convergence. Finally, a slope optimization method is proposed based on the walking characteristics of quadruped robots: the safe slope range serves as a threshold, and states are adjusted and optimized through a slope reward function. Simulation results show that, compared with the proximal policy optimization (PPO) algorithm and two improved reinforcement learning algorithms, the proposed algorithm converges better, improves the success rate of safely reaching the target point by more than 16.45%, reduces the maximum path slope by more than 34.4%, and plans a stable path with an average slope of 21° to 25°.
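The abstract only outlines the slope-threshold idea behind the potential field and reward shaping; it does not give their exact forms. The Python sketch below is a minimal illustration under assumed forms: the names local_slope_deg, slope_potential, and slope_reward, the 25° safety threshold, and all gain constants are hypothetical placeholders, not the paper's actual formulation.

import numpy as np

SAFE_SLOPE_DEG = 25.0  # assumed upper bound of the safe slope range

def local_slope_deg(height, ix, iy, cell=1.0):
    """Local terrain slope in degrees from a height map, via central differences."""
    h, w = height.shape
    dzdx = (height[iy, min(ix + 1, w - 1)] - height[iy, max(ix - 1, 0)]) / (2.0 * cell)
    dzdy = (height[min(iy + 1, h - 1), ix] - height[max(iy - 1, 0), ix]) / (2.0 * cell)
    return float(np.degrees(np.arctan(np.hypot(dzdx, dzdy))))

def slope_potential(slope_deg, goal_dist, k_att=1.0, k_slope=0.05):
    """Illustrative potential: goal attraction plus a penalty for unsafe slope."""
    return k_att * goal_dist + k_slope * max(0.0, slope_deg - SAFE_SLOPE_DEG) ** 2

def slope_reward(progress, slope_deg, k_pen=0.1):
    """Reward shaping: progress toward the goal, penalized beyond the safe slope."""
    return progress - k_pen * max(0.0, slope_deg - SAFE_SLOPE_DEG)

if __name__ == "__main__":
    # 20 x 20 ramp rising 1 m per cell in x: local slope = atan(1) = 45 degrees
    ramp = np.tile(np.arange(20.0), (20, 1))
    s = local_slope_deg(ramp, 10, 10)
    print(f"slope {s:.1f} deg, potential {slope_potential(s, goal_dist=5.0):.2f}, "
          f"reward {slope_reward(progress=1.0, slope_deg=s):.2f}")

On the 45° ramp, the reward is docked for the 20° excess over the assumed threshold, which is the general mechanism the abstract describes: states within the safe slope range keep their full progress reward, while steeper states are penalized.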
Keywords: mountain environment; path planning; quadruped robot; reinforcement learning; artificial potential field method
Classification: TP242 [Automation and Computer Technology - Detection Technology and Automatic Devices]