Authors: ZHAO Ming-hui (赵铭慧); ZHANG Xue-bo (张雪波) [1,2]; GUO Xian (郭宪); OU Yong-sheng (欧勇盛)
Affiliations: [1] Institute of Robotics and Automatic Information System, Nankai University, Tianjin 300350, China; [2] Tianjin Key Laboratory of Intelligent Robotics, Tianjin 300350, China; [3] Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, Guangdong, China
Source: Control and Decision (《控制与决策》), 2022, No. 4, pp. 861-870 (10 pages)
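Funding: National Natural Science Foundation of China (U1613210); Tianjin Science Fund for Distinguished Young Scholars (19JCJQJC62100); Tianjin Natural Science Foundation (19JCYBJC18500); Fundamental Research Funds for the Central Universities.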
Abstract: Most existing algorithms for assembly sequence planning target a single goal configuration. When applied to multiple target configurations or to large-scale problems, they tend to suffer from the curse of dimensionality and poor generalization. To address this, the paper exploits the hierarchical structure of the assembly sequence planning problem and proposes a general planning method, based on hierarchical reinforcement learning, for multi-configuration assembly tasks. First, the problem is formulated as a hierarchical Markov decision process in which the upper layer plans the assembly sequence and the lower layer plans the motion of each part. This matches the layered structure of the assembly process and makes the planning method more flexible and more interpretable. Second, a general assembly sequence planning algorithm based on hierarchical reinforcement learning is proposed for this hierarchical Markov decision process; it improves the method's adaptability and generalization across tasks with different target configurations, as well as its use of target-configuration information. Finally, the method is verified on a purpose-built simulation platform. The results show that it extracts general information about the assembly problem and makes good decisions for different initial part positions and for a variety of other configuration assembly tasks, which supports the effectiveness and generality of the method as a more general and flexible assembly sequence planning algorithm for multiple target configurations.
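The two-layer decomposition described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the grid world, the support rule, and the names `upper_policy`, `lower_policy`, and `plan` are all hypothetical, and both policies are hand-written placeholders standing in for the learned ones. It only shows the control flow, with the upper layer choosing which part to assemble next given the target configuration and the lower layer moving that part step by step (collisions are ignored).

```python
from typing import Dict, List, Set, Tuple

Cell = Tuple[int, int]  # (x, y) grid cell; y == 0 is the ground row (assumed toy world)


def upper_policy(placed: Set[str], goal: Dict[str, Cell]) -> str:
    """Upper layer: pick the next part to assemble.

    Placeholder rule (stands in for a learned policy): choose an unplaced
    part whose target cell is supported, i.e. on the ground row or directly
    above a target cell that has already been filled.
    """
    filled = {goal[p] for p in placed}
    for part in sorted(goal):
        if part in placed:
            continue
        x, y = goal[part]
        if y == 0 or (x, y - 1) in filled:
            return part
    raise ValueError("no feasible part: goal configuration is unsupported")


def lower_policy(pos: Cell, target: Cell) -> Cell:
    """Lower layer: one motion step for the selected part.

    Placeholder rule: move greedily along x, then along y. Collisions
    between parts are ignored in this sketch.
    """
    x, y = pos
    tx, ty = target
    if x != tx:
        return (x + (1 if tx > x else -1), y)
    if y != ty:
        return (x, y + (1 if ty > y else -1))
    return pos


def plan(start: Dict[str, Cell], goal: Dict[str, Cell]) -> Tuple[List[str], Dict[str, Cell]]:
    """Run the two-level loop; return the assembly sequence and final positions."""
    positions = dict(start)
    placed: Set[str] = set()
    sequence: List[str] = []
    while len(placed) < len(goal):
        part = upper_policy(placed, goal)      # upper-level decision: which part
        while positions[part] != goal[part]:   # lower-level rollout: how to move it
            positions[part] = lower_policy(positions[part], goal[part])
        placed.add(part)
        sequence.append(part)
    return sequence, positions
```

Because the target configuration is passed in as an argument rather than hard-coded, the same loop can be called with different goals, which is the property the abstract emphasizes. For example, with parts starting at `{"a": (3, 0), "b": (0, 0), "c": (5, 0)}` and the goal `{"a": (1, 1), "b": (1, 0), "c": (2, 0)}`, part `b` has to be placed before part `a`, which rests on top of it.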
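Keywords: intelligent assembly; assembly sequence planning; deep reinforcement learning; goal-oriented; hierarchical reinforcement learning; multi-configuration tasks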
Classification: TP273 [Automation and Computer Technology: Detection Technology and Automatic Equipment]