A general assembly sequence planning algorithm based on hierarchical reinforcement learning  Cited by: 6

Authors: ZHAO Ming-hui; ZHANG Xue-bo[1,2]; GUO Xian; OU Yong-sheng

Affiliations: [1] Institute of Robotics and Automatic Information System, Nankai University, Tianjin 300350, China; [2] Tianjin Key Laboratory of Intelligent Robotics, Tianjin 300350, China; [3] Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055, China

Source: Control and Decision, 2022, Issue 4, pp. 861-870 (10 pages)

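Funding: National Natural Science Foundation of China (U1613210); Tianjin Science Fund for Distinguished Young Scholars (19JCJQJC62100); Tianjin Natural Science Foundation (19JCYBJC18500); Fundamental Research Funds for the Central Universities.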

Abstract: Most existing algorithms for assembly sequence planning target a single goal configuration. When applied to multiple target configurations or to large-scale problems, they tend to suffer from the curse of dimensionality and generalize poorly. To address this, the paper exploits the hierarchical structure of the assembly sequence planning problem and proposes a general planning method, based on hierarchical reinforcement learning, that is suitable for multi-configuration assembly tasks. First, the problem is formulated as a hierarchical Markov decision process in which the upper layer plans the assembly sequence and the lower layer plans the motion of each part. This matches the hierarchical structure of the assembly process and makes the planning method more flexible and more interpretable. Second, for this hierarchical Markov decision process, a general assembly sequence planning algorithm based on hierarchical reinforcement learning is proposed; it improves the planner's adaptability and generalization across tasks with different target configurations, as well as its utilization of target-configuration information. Finally, the method is validated on a purpose-built simulation platform. The results show that it extracts general information about the assembly problem and makes good decisions both for different initial part positions and for a variety of other target configurations, which verifies its effectiveness and generality and indicates that the algorithm is a more general and flexible assembly sequence planner for multiple target configurations.
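The two-layer decomposition described in the abstract can be illustrated with a small toy sketch. This is not the paper's algorithm: the tower-stacking task, the reward values, the function names (`lower_level_move`, `train_upper`, `plan`) and the use of tabular Q-learning are all assumptions made for illustration. The upper layer learns which part to place next, conditioned on the goal configuration; the lower layer is a greedy stand-in for a learned motion policy.

```python
import random
from collections import defaultdict

N_PARTS = 3  # hypothetical toy size: three parts stacked into one tower


def lower_level_move(start, target):
    """Lower layer: move one part across a grid in unit steps (x first, then y).

    A greedy stand-in for a learned motion policy. Returns the visited cells.
    """
    x, y = start
    path = [(x, y)]
    while (x, y) != target:
        if x != target[0]:
            x += 1 if target[0] > x else -1
        else:
            y += 1 if target[1] > y else -1
        path.append((x, y))
    return path


def is_feasible(goal, placed, part):
    """A part is placeable only if it is next in the bottom-to-top goal order."""
    return goal[len(placed)] == part


def train_upper(goals, episodes=2000, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    """Upper layer: goal-conditioned tabular Q-learning over 'which part next'."""
    rng = random.Random(seed)
    q = defaultdict(float)  # key: (goal, set of placed parts, candidate part)
    for _ in range(episodes):
        goal = rng.choice(goals)
        placed = frozenset()
        for _ in range(4 * N_PARTS):  # step cap per episode
            if len(placed) == N_PARTS:
                break
            if rng.random() < eps:  # epsilon-greedy exploration
                part = rng.randrange(N_PARTS)
            else:
                part = max(range(N_PARTS), key=lambda p: q[(goal, placed, p)])
            ok = is_feasible(goal, placed, part)
            nxt = placed | {part} if ok else placed
            reward = 1.0 if ok else -1.0  # assumed reward shaping
            done = len(nxt) == N_PARTS
            best_next = 0.0 if done else max(
                q[(goal, nxt, p)] for p in range(N_PARTS))
            key = (goal, placed, part)
            q[key] += alpha * (reward + gamma * best_next - q[key])
            placed = nxt
    return q


def plan(q, goal, starts, slots):
    """Roll out both layers: pick a part greedily, then move it to its slot."""
    placed, sequence, paths = frozenset(), [], {}
    while len(placed) < N_PARTS:
        part = max(range(N_PARTS), key=lambda p: q[(goal, placed, p)])
        if not is_feasible(goal, placed, part):
            raise RuntimeError("upper policy chose an infeasible part")
        paths[part] = lower_level_move(starts[part], slots[len(placed)])
        sequence.append(part)
        placed = placed | {part}
    return sequence, paths


# Usage: train on several target configurations, then plan for one of them.
goals = [(0, 1, 2), (2, 0, 1), (1, 2, 0)]          # bottom-to-top orders
starts = {0: (0, 0), 1: (3, 1), 2: (5, 2)}         # initial part positions
slots = [(2, 0), (2, 1), (2, 2)]                   # tower cells, bottom to top
q_table = train_upper(goals)
sequence, paths = plan(q_table, (2, 0, 1), starts, slots)
print(sequence, paths)
```

Because the table is keyed on the goal, one policy serves several target configurations, but a lookup table cannot generalize to goals it never saw during training; the generalization reported in the abstract would require a function approximator such as a neural network, which this sketch deliberately omits.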

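Keywords: intelligent assembly; assembly sequence planning; deep reinforcement learning; goal-oriented; hierarchical reinforcement learning; multi-configuration tasks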

Classification number: TP273 [Automation and Computer Technology: Detection Technology and Automatic Devices]

 
