Microassembly Task Planning Method Based on Multi-agent Reinforcement Learning


Authors: XU Xinghui; TANG Dalin [2]; GU Shuhao; ZUO Jiaqi; WANG Xiaodong; REN Tongqun [2,3]

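Affiliations: [1] Key Laboratory for Micro/Nano Technology and System of Liaoning Province, Dalian University of Technology, Dalian 116024, Liaoning, China; [2] Beijing Aerospace Measurement & Control Technology Co., Ltd., Beijing 100041, China; [3] State Key Laboratory of High-Performance Precision Manufacturing, Dalian University of Technology, Dalian 116024, Liaoning, China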

Source: Computer Measurement & Control (《计算机测量与控制》), 2023, No. 8, pp. 217-223 (7 pages)

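Funding: National Key Research and Development Program of China (2019YFB1310901); Liaoning Province "Xingliao Talents Program" (XLYC2002020); Natural Science Foundation of Liaoning Province (2020-MS-104).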

Abstract: Existing assembly task planning is mostly manual and suffers from low efficiency, high cost, and susceptibility to operator error. To address this, the task characteristics of microassembly operations are analyzed, together with the cooperative and competitive relationships among multiple manipulator arms, and methods are proposed for constructing an action space, a state space, and a reward function suited to microassembly tasks in multi-agent reinforcement learning. The existing equipment is physically modeled in the CoppeliaSim simulation software. A learning model based on the multi-agent deep deterministic policy gradient (MADDPG) algorithm is built and trained, and the designed state space, action space, and reward function are verified item by item in the simulation environment, yielding a stable path and a complete task implementation scheme. The simulation results show that the proposed environment construction method is better suited to microassembly tasks whose main framework is Cartesian coordinate motion, overcomes the shortcomings of existing planning methods, and achieves multi-arm cooperative operation that can be engineered in practice, improving task efficiency and the degree of planning automation.
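As a rough illustration of the kind of state, action, and reward definitions the abstract refers to, the sketch below shows one possible per-agent formulation for manipulator arms that move along Cartesian axes. It is not taken from the paper: the function names, the constants (`MAX_STEP_MM`, `SAFE_DIST_MM`, `GOAL_TOL_MM`), and the reward weights are all assumptions chosen for illustration, and the paper's actual design may differ substantially.

```python
import math
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]

# All constants below are hypothetical, not values from the paper.
MAX_STEP_MM = 0.5    # assumed per-step travel limit along each axis
SAFE_DIST_MM = 2.0   # assumed minimum clearance between two arm tips
GOAL_TOL_MM = 0.05   # assumed tolerance for "target reached"


def clip(v: float, lo: float = -1.0, hi: float = 1.0) -> float:
    """Clamp a raw policy output to the interval [lo, hi]."""
    return max(lo, min(hi, v))


def apply_action(pos: Vec3, action: Sequence[float]) -> Vec3:
    """Action space: a continuous X/Y/Z displacement, clipped and scaled."""
    return tuple(p + MAX_STEP_MM * clip(a) for p, a in zip(pos, action))


def observe(i: int, positions: List[Vec3], goals: List[Vec3]) -> List[float]:
    """State for agent i: own position, offset to own goal, offsets to other arms."""
    own = positions[i]
    obs = list(own)
    obs += [g - p for g, p in zip(goals[i], own)]
    for j, other in enumerate(positions):
        if j != i:
            obs += [o - p for o, p in zip(other, own)]
    return obs


def reward(i: int, positions: List[Vec3], goals: List[Vec3]) -> float:
    """Reward for agent i: approach the goal, avoid crowding the other arms."""
    dist = math.dist(positions[i], goals[i])
    r = -dist                      # dense shaping term: closer is better
    if dist <= GOAL_TOL_MM:
        r += 10.0                  # assumed bonus for reaching the target
    for j, other in enumerate(positions):
        if j != i and math.dist(positions[i], other) < SAFE_DIST_MM:
            r -= 5.0               # assumed penalty for violating clearance
    return r
```

In a MADDPG-style setup, each actor would typically receive only its own `observe(i, ...)` vector, while a centralized critic sees all agents' observations and actions during training. The clearance penalty is one simple way to encode competition for a shared workspace, whereas the goal terms reward each arm for completing its part of the task.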

Keywords: multi-agent reinforcement learning; reward function; microassembly; task planning; simulation environment construction

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

 
