Authors: LI Shuyi (李淑怡); YANG Bo (阳波)[2]; CHEN Ling (陈灵); SHEN Ling (沈玲); TANG Wensheng (唐文胜)
Affiliations: [1] College of Information Science and Engineering, Hunan Normal University, Changsha 410081, Hunan, China; [2] College of Engineering and Design, Hunan Normal University, Changsha 410081, Hunan, China
Source: Computer Engineering (《计算机工程》), 2025, No. 3, pp. 86-94 (9 pages)
Funding: National Natural Science Foundation of China, Young Scientists Fund (62203167).
Abstract: Existing surface coverage methods adapt poorly to changes in surface shape and cover surfaces inefficiently during robot cleaning operations. To address this, the paper proposes SC-SRPPO, a surface coverage method based on Proximal Policy Optimization (PPO) with an adaptive reward function. First, the target surface is discretized, the covariance matrix is obtained via ball query, and the normal vectors of the point cloud are solved from it to establish the 3D surface model. Second, a state model is constructed using the coverage-state and curvature-change features of the local surface point cloud as observations of the surface model, which helps the robot's trajectory follow the surface and improves its adaptability to surface changes. Third, an adaptive reward function is built from the global coverage rate of the surface and a time-related exponential model, guiding the robot toward uncovered areas and improving coverage efficiency. Finally, the local state model, the reward function, and the PPO reinforcement learning algorithm are combined to train the robot to complete the surface coverage path planning task. Experiments were run on three surface models (sphere, saddle, and 3D heart), with point-cloud coverage rate and coverage completion time as the main evaluation metrics. SC-SRPPO achieved an average coverage rate of 90.72%; compared with NSGA-II, PPO, and SAC, the coverage rate improved by 4.98%, 14.56%, and 27.11%, respectively, and the coverage completion time was shortened by 15.20%, 67.18%, and 62.64%, respectively. The results show that SC-SRPPO enables the robot to complete the surface coverage task more efficiently while adapting to surface changes.
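Keywords: cleaning robot; curved surface; coverage path planning; reinforcement learning; Proximal Policy Optimization
Classification number: TP242.3 [Automation and Computer Technology: Detection Technology and Automatic Devices]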
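
Illustrative sketch (not from the paper): the first step in the abstract, obtaining a covariance matrix by ball query and solving for point-cloud normals, is commonly done by taking the eigenvector of the smallest eigenvalue of each neighborhood's covariance matrix. The code below is a minimal version of that standard approach, assuming NumPy, a brute-force neighbor search, and a hypothetical function name `estimate_normals` with a hypothetical `radius` parameter; the authors' actual implementation and parameters are not given in this record.

```python
import numpy as np


def estimate_normals(points, radius):
    """Estimate one unit normal per point from its ball-query neighborhood.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    points = np.asarray(points, dtype=float)
    normals = np.zeros_like(points)
    for i, p in enumerate(points):
        # Ball query: every point within `radius` of p (brute force, O(N) each).
        dists = np.linalg.norm(points - p, axis=1)
        neighbors = points[dists <= radius]
        if len(neighbors) < 3:
            continue  # too few neighbors to define a plane; normal stays zero
        # 3x3 covariance matrix of the neighborhood around its centroid.
        centered = neighbors - neighbors.mean(axis=0)
        cov = centered.T @ centered / len(neighbors)
        # eigh returns eigenvalues in ascending order, so column 0 is the
        # direction of least variance, taken here as the surface normal.
        _, eigvecs = np.linalg.eigh(cov)
        normals[i] = eigvecs[:, 0]
    return normals


if __name__ == "__main__":
    # Flat 10 x 10 grid in the plane z = 0: normals should be +/- the z axis.
    xs, ys = np.meshgrid(np.arange(10) * 0.1, np.arange(10) * 0.1)
    grid = np.column_stack([xs.ravel(), ys.ravel(), np.zeros(xs.size)])
    print(estimate_normals(grid, radius=0.25)[:3])
```

Eigenvectors are defined only up to sign, so normals computed this way still need a consistent orientation step (for example, flipping each one to face a known viewpoint) before they are used to build a surface model.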