Autonomous reinforcement learning control for space robot to capture non-cooperative targets

Cited by: 7

Authors: LIU Shuai; WU ShuNan [1]; LIU YuFei [2]; WU ZhiGang [1]; MAO ZiMing

Affiliations: [1] School of Aeronautics and Astronautics, Dalian University of Technology, Dalian 116024, China; [2] Tsien Hsueshen Laboratory of Space Technology, China Academy of Space Technology, Beijing 100094, China; [3] Department of Engineering Mechanics, Dalian University of Technology, Dalian 116024, China

Source: Scientia Sinica Physica, Mechanica & Astronomica, 2019, No. 2, pp. 109-118 (10 pages)

Funding: Supported by the National Natural Science Foundation of China (Grant No. 91748203)

Abstract: On-orbit servicing by space robots has become a research focus in many countries in recent years. For the task of capturing a non-cooperative target with a space robot, this paper proposes a dual-loop control method that combines reinforcement learning control with PD control to govern both the attitude of the base platform and the motion of the manipulator. First, the space mission of the robot is analyzed and a coupled dynamic model covering the base platform attitude and the manipulator motion is established. Then, a dual-loop control system is designed to control the manipulator motion and the base platform attitude separately: in the inner loop, a controller that combines reinforcement learning with fuzzy theory controls the motion of the manipulator end-effector; in the outer loop, PD control stabilizes the base platform attitude. Finally, numerical simulations are carried out with the proposed method and compared against a conventional PD control method to verify its effectiveness. The results show that manipulator motion under reinforcement learning control is smooth and the control accuracy is high. Compared with conventional PD control, the method has a degree of autonomous learning ability and is better suited to the non-cooperative nature of the capture target.
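The dual-loop idea in the abstract can be illustrated with a toy sketch. This is not the paper's controller or dynamic model. It assumes a single-axis rigid base with unit inertia for the outer PD loop. For the inner loop it assumes a one-dimensional end-effector position error, handled by Q-learning over a triangular fuzzy state partition. All names (`pd_torque`, `simulate_base`, `memberships`, `FuzzyQ`, `train`), gains, membership centers, and the reward are hypothetical choices made for illustration only.

```python
import random


def pd_torque(angle_err, rate, kp=4.0, kd=4.0):
    """Outer loop (assumed form): PD law driving base attitude error to zero."""
    return -kp * angle_err - kd * rate


def simulate_base(theta0, disturbance=0.0, inertia=1.0, dt=0.01, steps=1000):
    """Single-axis rigid base under PD control, semi-implicit Euler integration.

    `disturbance` stands in for the reaction torque of the arm (an assumption).
    """
    theta, omega = theta0, 0.0
    for _ in range(steps):
        torque = pd_torque(theta, omega) + disturbance
        omega += torque / inertia * dt
        theta += omega * dt
    return theta


def memberships(x, centers):
    """Triangular fuzzy memberships over sorted centers; the degrees sum to 1."""
    x = min(max(x, centers[0]), centers[-1])
    mu = [0.0] * len(centers)
    for i in range(len(centers) - 1):
        lo, hi = centers[i], centers[i + 1]
        if lo <= x <= hi:
            w = (x - lo) / (hi - lo)
            mu[i], mu[i + 1] = 1.0 - w, w
            break
    return mu


class FuzzyQ:
    """Inner loop (assumed form): Q-learning with fuzzy basis functions."""

    def __init__(self, centers, actions, alpha=0.5, gamma=0.9):
        self.centers, self.actions = centers, actions
        self.alpha, self.gamma = alpha, gamma
        # One row of action values per fuzzy set.
        self.q = [[0.0] * len(actions) for _ in centers]

    def values(self, x):
        """Membership-weighted action values at state x."""
        mu = memberships(x, self.centers)
        return [sum(m * row[a] for m, row in zip(mu, self.q))
                for a in range(len(self.actions))]

    def act(self, x, eps, rng):
        """Epsilon-greedy action index."""
        if rng.random() < eps:
            return rng.randrange(len(self.actions))
        v = self.values(x)
        return v.index(max(v))

    def update(self, x, a, reward, x_next):
        """TD update distributed over the fuzzy sets that are active at x."""
        td = reward + self.gamma * max(self.values(x_next)) - self.values(x)[a]
        for i, m in enumerate(memberships(x, self.centers)):
            self.q[i][a] += self.alpha * m * td


def train(agent, episodes=200, steps=30, eps=0.2, seed=0):
    """Toy task: each action shifts the tracking error; reward is -|error|."""
    rng = random.Random(seed)
    for _ in range(episodes):
        e = rng.uniform(-1.0, 1.0)
        for _ in range(steps):
            a = agent.act(e, eps, rng)
            e_next = min(max(e + agent.actions[a], -1.0), 1.0)
            agent.update(e, a, -abs(e_next), e_next)
            e = e_next
    return agent


if __name__ == "__main__":
    agent = train(FuzzyQ([-1.0, -0.5, 0.0, 0.5, 1.0], [-0.1, 0.0, 0.1]))
    print("action values at e=0.5:", agent.values(0.5))
    print("base attitude after 10 s:", simulate_base(0.2))
```

In this sketch the two loops are simulated separately. A closer analogue of the paper's setup would feed the arm's reaction torque into `simulate_base` through the `disturbance` argument at every step, which would require the coupled dynamic model that the abstract mentions but does not give.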

Keywords: space robot; non-cooperative target; reinforcement learning; manipulator; fuzzy theory

Classification: TP242 [Automation and Computer Technology: Detection Technology and Automation Devices]
