基于深度强化学习的动态装配算法  被引量:2

Dynamic assembly algorithm based on deep reinforcement learning

在线阅读下载全文

作  者:王竣禾 姜勇[1,2] WANG Junhe;JIANG Yong(State Key Laboratory of Robotics,Shenyang Institute of Automation,Chinese Academy of Sciences,Shenyang 110016,China;Institutes for Robotics and Intelligent Manufacturing,Chinese Academy of Sciences,Shenyang 110169,China;University of Chinese Academy of Sciences,Beijing 100049,China)

机构地区:[1]中国科学院沈阳自动化研究所机器人学国家重点实验室,辽宁沈阳110016 [2]中国科学院机器人与智能制造创新研究院,辽宁沈阳110169 [3]中国科学院大学,北京100049

出  处:《智能系统学报》2023年第1期2-11,共10页CAAI Transactions on Intelligent Systems

基  金:国家自然科学基金项目(52075531)。

摘  要:针对动态装配环境中存在的复杂、动态的噪声扰动,提出一种基于深度强化学习的动态装配算法。将一段时间内的接触力作为状态,通过长短时记忆网络进行运动特征提取;定义序列贴现因子,对之前时刻的分奖励进行加权得到当前时刻的奖励值;模型输出的动作为笛卡尔空间位移,使用逆运动学调整机器人到达期望位置。与此同时,提出一种对带有资格迹的时序差分算法改进的神经网络参数更新方法,可缩短模型训练时间。在实验部分,首先在圆孔–轴的简单环境中进行预训练,随后在真实场景下继续训练。实验证明提出的方法可以很好地适应动态装配任务中柔性、动态的装配环境。A dynamic assembly algorithm based on deep reinforcement learning is proposed for complex dynamic noise perturbations in the dynamic assembly environment. Taking the contact force in a period of time as a state, the motion features are extracted through the long short-term memory. Define the sequence discount factor, and obtain the reward value at a certain moment through weighting the sub-reward at the previous moment. The robot can be adjusted to the desired position using inverse kinematics, with the action of model output as the Cartesian space displacement. In the meanwhile, an improved neural network parameter update method is proposed based on the temporal difference(λ) algorithm to shorten the model training time. Experimentally, training was conducted in the real scene upon pre-training in the simple environment with the circular hole-axis. According to the experiments, the proposed algorithm can well adapt to the flexible and dynamic assembly environment in a dynamic assembly task.

关 键 词:柔索模型 动态噪声 动态装配 深度强化学习 长短时记忆网络 序列贴现因子 带有资格迹的时序差分算法 预训练 

分 类 号:TP242.6[自动化与计算机技术—检测技术与自动化装置]

 

参考文献:

正在载入数据...

 

二级参考文献:

正在载入数据...

 

耦合文献:

正在载入数据...

 

引证文献:

正在载入数据...

 

二级引证文献:

正在载入数据...

 

同被引文献:

正在载入数据...

 

相关期刊文献:

正在载入数据...

相关的主题
相关的作者对象
相关的机构对象