Robot control policy transfer based on progressive neural network

Cited by: 2

Authors: SUI Hongjian (隋洪建), SHANG Weiwei (尚伟伟) [1], LI Xiang (李想) [1], CONG Shuang (丛爽) [1] (Department of Automation, University of Science and Technology of China, Hefei 230027)

Affiliation: [1] Department of Automation, University of Science and Technology of China, Hefei 230027, Anhui

Source: Journal of University of Science and Technology of China (JUSTC), 2019, No. 10, pp. 812-819 (8 pages)

Funding: Supported by the National Natural Science Foundation of China (51675501).

Abstract: In robotic control, solving complicated control tasks with deep learning is appealing, but collecting enough robot operating data to train deep learning models is difficult. To address this, a transfer algorithm based on the progressive neural network (PNN) is proposed. Built on the deep deterministic policy gradient (DDPG) framework, the algorithm links the pretrained models in a model pool with the control model of the target task, thereby transferring the control policy from the source tasks to the target task. Results from two simulation experiments show that the algorithm successfully transfers control policies learned on previous tasks to the control model of the target task. Compared with other baseline methods, it takes substantially less time to learn the target task.
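As a rough illustration of how a target-task model can be linked to pretrained models in a pool, the following minimal NumPy sketch builds a deterministic actor whose output layer also receives the hidden features of frozen source columns through lateral connections, in the spirit of a progressive neural network. It is not the paper's architecture or training procedure: the class names `Column` and `ProgressiveActor`, the layer sizes, the tanh activations, and the zero initialisation of the lateral weights are all assumptions made here for illustration, and the DDPG critic and update rules are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_layer(n_in, n_out):
    """Small random weights and a zero bias for one dense layer."""
    return rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out)

class Column:
    """One actor column: state -> hidden features -> bounded action."""

    def __init__(self, state_dim, hidden_dim, action_dim):
        self.W1, self.b1 = init_layer(state_dim, hidden_dim)
        self.W2, self.b2 = init_layer(hidden_dim, action_dim)

    def hidden(self, state):
        # Hidden features of this column (tanh is an illustrative choice).
        return np.tanh(state @ self.W1 + self.b1)

    def act(self, state):
        # Deterministic policy output, squashed into [-1, 1].
        return np.tanh(self.hidden(state) @ self.W2 + self.b2)

class ProgressiveActor(Column):
    """Target-task column that also reads features of frozen source columns."""

    def __init__(self, state_dim, hidden_dim, action_dim, source_columns):
        super().__init__(state_dim, hidden_dim, action_dim)
        # Pretrained columns from the model pool; they are only read, never updated.
        self.sources = list(source_columns)
        # One lateral weight matrix per source column (assumed zero-initialised,
        # so the new column starts out unaffected by the source columns).
        self.U = [np.zeros((src.W1.shape[1], action_dim)) for src in self.sources]

    def act(self, state):
        pre = self.hidden(state) @ self.W2 + self.b2
        for src, U in zip(self.sources, self.U):
            # Lateral connection: source hidden features feed the target output.
            pre = pre + src.hidden(state) @ U
        return np.tanh(pre)

# Usage: one pretrained source column, one new target column.
source = Column(state_dim=3, hidden_dim=8, action_dim=2)
target = ProgressiveActor(3, 8, 2, source_columns=[source])
action = target.act(np.ones(3))
```

In a full training loop only the target column's own weights and the lateral matrices would receive gradient updates, so the source policies stay intact while their features remain available to the new task.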

Keywords: robot control; transfer learning; deep reinforcement learning; progressive neural network

Classification number: TP242 [Automation and Computer Technology: Detection Technology and Automation Devices]
