异质Agent间的知识迁移强化学习被引量：1

Knowledge transfer between heterogeneous reinforcement learning agent

出　　处：《中国科技论文在线》2010年第2期120-123,共4页

摘　　要：针对现有知识迁移方法仅适用于同质强化学习Agent的问题,提出一种能够在具有不同状态动作空间的异质Agent间迁移知识的Q学习算法。该算法的主要思想是通过新旧Agent共同学习过的任务,利用神经网络离线学习两Agent间的Q值函数映射关系,利用构造的Q值函数映射器把旧Agent学过而新Agent没有学过的任务的Q值映射到新Agent上,从而可以减少新Agent的学习尝试次数,提高学习速度。10×10格子世界仿真结果验证了所提知识迁移Q学习算法的有效性。Aiming at the problem of the existing knowledge transfer methods are only suitable for homogenous reinforcement learning agents, a kind of Q learning algorithm that can transfer knowledge between heterogeneous Agents with different state and action spaces. The main idea of the proposed Q leaming algorithm can be described as the follows. Based on a task that was already learned by an old and a new Agent, a neural network was used to off-line learn a mapping relationship of Q value function between the two Agents. The constructed mapping of Q value function was then used to obtain Q value of the new Agent in a new task that was already learned by the old Agent while was not learned by the new Agent. The proposed Q learning algorithm can decrease the number of trials of the new Agent and so as to improve learning speed. Simulation results of 10×10 mazes illustrate the validity of the proposed Q learning algorithm.

关键词：强化学习知识迁移异质Agent Q值

分类号：TP18[自动化与计算机技术—控制理论与控制工程]

参考文献：

正在载入数据...

二级参考文献：

正在载入数据...

耦合文献：

正在载入数据...

引证文献：

正在载入数据...

二级引证文献：

正在载入数据...

同被引文献：

正在载入数据...

高级检索 检索式检索

时间限定

期刊范围

学科限定全选

高级检索 检索式检索

时间限定

期刊范围

学科限定全选

异质Agent间的知识迁移强化学习被引量：1

我的收藏

参考文献：

二级参考文献：

耦合文献：

引证文献：

二级引证文献：

同被引文献：

相关期刊文献：

相关的主题

相关的作者对象

相关的机构对象

下载全文

高级检索检索式检索

时间限定

期刊范围

学科限定全选

高级检索 检索式检索

时间限定

期刊范围

学科限定全选

异质Agent间的知识迁移强化学习 被引量：1

我的收藏

参考文献：

二级参考文献：

耦合文献：

引证文献：

二级引证文献：

同被引文献：

相关期刊文献：

相关的主题

相关的作者对象

相关的机构对象

下载全文

用户登录

高级检索检索式检索

异质Agent间的知识迁移强化学习被引量：1