Robot Arm Control Method of Deep Reinforcement Learning  Cited by: 5


Authors: JI Zhouke; XU Qiaoyu[1]; WANG Junwei; LI Kunpeng (Mechatronics Engineering School, Henan University of Science & Technology, Luoyang 471003, China; Luoyang GINGKO Technology Co., Ltd., Luoyang 471003, China)

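Affiliations: [1] Mechatronics Engineering School, Henan University of Science & Technology, Luoyang 471003, Henan, China [2] Luoyang GINGKO Technology Co., Ltd., Luoyang 471003, Henan, China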

Source: Journal of Henan University of Science and Technology: Natural Science, 2021, No. 3, pp. 19-24, M0003 (7 pages)

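Funding: National Natural Science Foundation of China (51205108); Key Scientific Research Fund Project of Higher Education Institutions of Henan Province (15A535001).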

Abstract: To address the problem that the end control accuracy of industrial hydraulic robot arms is degraded by inertia, friction, and other factors, a robot arm control method based on deep reinforcement learning is proposed. First, a simulated robot arm is built in the Robot Operating System (ROS) environment, and its control and communication modules are designed. Then the Actor-Critic network in the deep deterministic policy gradient (DDPG) algorithm is designed and, based on the inverse kinematics of the robot arm and the reward mechanism of deep reinforcement learning, a hierarchical reward function containing an accuracy index is designed to promote convergence of the DDPG algorithm. Finally, the improved DDPG algorithm is trained interactively with the simulated robot arm to obtain a control model of the arm, achieving accurate control of the arm's end. Experimental results show that the convergence speed of the improved DDPG algorithm increases by about 14.54%, that the robot arm reaches an end position control accuracy of 6 mm in the simulation environment, and that the multipoint test completion rate reaches up to 90%.
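
The abstract does not give the form of the hierarchical reward function, so the following is only a minimal sketch of what a layered, distance-based reward with an accuracy threshold might look like. The function name `hierarchical_reward`, the `near_radius` parameter, the three-layer split, and all reward magnitudes are assumptions made for illustration; only the 6 mm accuracy figure comes from the abstract.

```python
import math

def hierarchical_reward(end_pos, target_pos, accuracy=0.006, near_radius=0.05):
    """Illustrative layered reward (positions in metres).

    Assumed layers, not taken from the paper:
      - accuracy layer: within `accuracy` (6 mm) of the target -> bonus, episode ends
      - near layer: within `near_radius` -> shaped reward in [0, 1)
      - far layer: otherwise -> negative distance penalty
    Returns (reward, done).
    """
    dist = math.dist(end_pos, target_pos)  # Euclidean distance, Python 3.8+
    if dist <= accuracy:
        return 10.0, True
    if dist <= near_radius:
        return 1.0 - dist / near_radius, False
    return -dist, False
```

In a DDPG training loop, a function of this kind would be called once per environment step, with the returned reward stored in the replay buffer and the `done` flag ending the episode when the accuracy threshold is met.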

Keywords: robot arm; deep reinforcement learning; DDPG; control accuracy

Classification: TP241 [Automation and Computer Technology: Detection Technology and Automatic Devices]

 
