Authors: MING Xin [1]; LU Danping [1]; CHEN Zhong [2]
Affiliations: [1] Intelligent Manufacturing College, Guangxi Vocational & Technical College, Nanning, Guangxi 530226, China; [2] Intelligent Manufacturing College, Nanning University, Nanning, Guangxi 530299, China
Source: Machine Tool & Hydraulics, 2023, No. 11, pp. 65-71 (7 pages)
Fund: 2020 Guangxi Education and Scientific Research Project (2020KY29017)
Abstract: To improve the stability and accuracy of target grasping by a vision-based robot, a reinforcement learning strategy centered on an exploration method is proposed. A deep attention-based deterministic policy gradient algorithm is adopted: a region-of-interest proposal network selects information from the pre-exploration region, and an adaptive exploration method processes that information so the policy adjusts as the target changes. Based on the distance between the end effector and the center of the pre-exploration region, a hierarchical reward function is defined to reduce the extraneous information introduced by a sparse reward matrix. Training was carried out in the Bullet3 environment. The experimental results show that the proposed strategy overcomes the poor stability and low convergence efficiency that can arise during training, produces robust control against noise interference, and achieves a high grasp success rate.
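The hierarchical reward described in the abstract can be illustrated with a minimal sketch. The tier radii, per-tier rewards, and grasp bonus below are illustrative assumptions (the abstract does not publish the paper's actual values); only the overall idea, denser rewards as the end effector nears the center of the pre-exploration region instead of a single sparse grasp-success signal, follows the text.

```python
from math import dist  # Euclidean distance (Python 3.8+)

# Hypothetical tier boundaries (meters) and rewards -- illustrative only.
TIERS = [(0.02, 1.0), (0.10, 0.2), (0.30, -0.1)]
R_GRASP = 10.0   # assumed bonus on a successful grasp
R_FAR = -0.5     # assumed penalty outside all tiers

def hierarchical_reward(ee_pos, region_center, grasped=False):
    """Reward shaped by the distance between the end effector and the
    center of the pre-exploration region, densifying the otherwise
    sparse grasp-success signal."""
    if grasped:
        return R_GRASP
    d = dist(ee_pos, region_center)
    for radius, reward in TIERS:
        if d <= radius:
            return reward
    return R_FAR
```

In this shaping scheme every step yields a graded signal, so the policy gradient receives useful feedback long before the first successful grasp, which is the mechanism the abstract credits for better stability and convergence during training.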
CLC number: TP242.6 [Automation and Computer Technology — Detection Technology and Automatic Equipment]