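Authors: YAN Yibing; SONG Aiguo[1]; ZHU Lifeng[1]; YAN Yufei (School of Instrument Science and Engineering, Southeast University, Nanjing 210096, China)
Affiliation: [1] School of Instrument Science and Engineering, Southeast University, Nanjing 210096, Jiangsu, China
Source: Measurement & Control Technology (《测控技术》), 2023, No. 4, pp. 35-41 (7 pages)
Funding: National Defense Basic Scientific Research Program (JCKY2022110C040).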
Abstract: Granular media are complex, which makes interactive control difficult for robots that manipulate them. In recent years, reinforcement learning has achieved notable results in robotics and related fields. Because granular materials such as soil are hard to model kinematically, a soil sampling controller based on offline reinforcement learning is proposed, which uses force-tactile (haptic) information gathered during sampling to guide the motion control of the sampler. With a rotary sampling head, the high-precision Chrono simulator is used to run particle simulations of the soil environment and to generate data for the soil sampling task. Task performance is measured jointly by three criteria: sampled amount, power, and time taken. The conservative Q-Learning algorithm is trained offline, which addresses both the low sample efficiency of reinforcement learning and the difficulty of real-time interaction in a granular simulation environment. The resulting agent policy controls the descent velocity and the rotational angular velocity of the sampler. Experiments show that the controller works well on soil sampling tasks and adjusts its control policy dynamically according to the measured force. The controller significantly improves sampling results and learns an optimized policy from data alone, without explicit knowledge of the particle parameters.
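Keywords: intelligent sampling; granular soil; offline reinforcement learning; force-tactile sensing (haptics); data-driven
Classification: TP242 [Automation and Computer Technology: Detection Technology and Automatic Equipment]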
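
Illustrative sketch (not from the article): the abstract names two ingredients, a performance measure built from sampled amount, power, and time, and offline training with conservative Q-Learning. The Python below shows one plausible shape for each under loud assumptions. The reward weights, the function names, and the discrete-action simplification are all hypothetical; the record does not state the actual reward formula, the action parameterization, or any hyperparameters.

```python
import math


def sampling_reward(volume, power, duration,
                    w_volume=1.0, w_power=0.1, w_time=0.05):
    """Hypothetical scalar reward: favor sampled amount, penalize power and time.

    The weights are placeholders chosen for illustration only.
    """
    return w_volume * volume - w_power * power - w_time * duration


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def cql_loss(q_values, action, td_target, alpha=1.0):
    """Conservative Q-Learning loss for one logged transition (discrete actions).

    q_values  : Q(s, a) for every candidate action a in state s
    action    : index of the action actually stored in the offline dataset
    td_target : bootstrapped target, e.g. r + gamma * max_a' Q_target(s', a')
    alpha     : weight of the conservative penalty (assumed value)
    """
    # Standard Bellman error on the logged action.
    bellman = 0.5 * (q_values[action] - td_target) ** 2
    # Conservative term: push down Q over all actions, push up the logged one.
    conservative = logsumexp(q_values) - q_values[action]
    return bellman + alpha * conservative
```

The conservative term is never negative, so actions absent from the logged data cannot receive inflated value estimates for free; this is the property that lets a policy be trained from pre-collected simulation data without further interaction. A continuous action space, such as the descent velocity and angular velocity mentioned in the abstract, would typically replace the exact log-sum-exp with a sampled approximation.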