Authors: LIU Shuai; WU ShuNan [1]; LIU YuFei [2]; WU ZhiGang [1]; MAO ZiMing
Affiliations: [1] School of Aeronautics and Astronautics, Dalian University of Technology, Dalian 116024, China; [2] Tsien Hsueshen Laboratory of Space Technology, China Academy of Space Technology, Beijing 100094, China; [3] Department of Engineering Mechanics, Dalian University of Technology, Dalian 116024, China
Source: Scientia Sinica Physica, Mechanica & Astronomica (《中国科学:物理学、力学、天文学》), 2019, Issue 2, pp. 109-118 (10 pages)
Funding: Supported by the National Natural Science Foundation of China (Grant No. 91748203)
Abstract: In recent years, on-orbit servicing by space robots has become a research focus in many countries. For the task of capturing a non-cooperative target with a space robot, this paper proposes a dual-loop control method that combines reinforcement learning control with PD control to govern both the attitude of the base platform and the motion of the manipulator. First, the space task is analyzed and a coupled dynamic model of the space robot is established, covering the base platform attitude and the manipulator motion. Then, a dual-loop control system is designed to control the manipulator motion and the base platform attitude separately: in the inner loop, a controller combining reinforcement learning with fuzzy theory controls the motion of the manipulator end-effector; in the outer loop, PD control stabilizes the attitude of the base platform. Finally, numerical simulations are carried out with the proposed method and compared against a conventional PD control method to verify its effectiveness. The results show that the manipulator motion under reinforcement learning control is smooth and the control precision is high. Compared with the conventional PD control method, the proposed method has a degree of self-learning ability and is better suited to the non-cooperative characteristics of the capture target.
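The outer loop described in the abstract stabilizes the base attitude with PD control. The abstract gives no model parameters or gains, so the following is only a minimal single-axis sketch of a PD attitude law under assumed values; the function name `simulate_pd_attitude`, the inertia, the gains, and the time step are all hypothetical, and the paper's coupled base-manipulator dynamics and its reinforcement learning inner loop are not reproduced here.

```python
def simulate_pd_attitude(theta0, omega0, inertia=10.0, kp=4.0, kd=12.0,
                         dt=0.01, steps=3000, disturbance=0.0):
    """Simulate one rotational axis of a rigid body under a PD attitude law.

    Assumed model (not from the paper): inertia * theta_ddot = torque + disturbance,
    with torque = -kp * theta - kd * omega. All numeric defaults are illustrative.
    """
    theta, omega = theta0, omega0
    history = []
    for _ in range(steps):
        # PD law: proportional term on attitude error, derivative term on rate.
        torque = -kp * theta - kd * omega
        alpha = (torque + disturbance) / inertia
        # Semi-implicit Euler step: update the rate first, then the angle.
        omega += alpha * dt
        theta += omega * dt
        history.append(theta)
    return theta, omega, history


if __name__ == "__main__":
    final_theta, final_omega, _ = simulate_pd_attitude(theta0=0.2, omega0=0.0)
    print(f"final attitude error: {final_theta:.2e} rad, rate: {final_omega:.2e} rad/s")
```

With these assumed gains the closed loop is close to critically damped, so an initial attitude error decays without sustained oscillation. A constant `disturbance` torque, which could stand in for the reaction of manipulator motion on the base, leaves a steady-state offset of `disturbance / kp` under pure PD control.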
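Keywords: space robot; non-cooperative target; reinforcement learning; manipulator; fuzzy theory
Classification number: TP242 [Automation and Computer Technology: Detection Technology and Automatic Devices]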