Knowledge transfer between heterogeneous reinforcement learning agents

Cited by: 1


Authors: LIU Bo (刘博) [1], LEI Ruhai (雷汝海) [1]

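Affiliation: [1] School of Information and Electrical Engineering, China University of Mining and Technology, Xuzhou, Jiangsu 221116, China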

Source: Sciencepaper Online (《中国科技论文在线》), 2010, No. 2, pp. 120-123 (4 pages)

Abstract: Existing knowledge transfer methods apply only to homogeneous reinforcement learning agents. To address this, a Q-learning algorithm is proposed that can transfer knowledge between heterogeneous agents with different state and action spaces. The main idea is to use tasks that both the old and the new agent have already learned to train a neural network offline on the mapping between the two agents' Q-value functions. The resulting Q-value mapper then maps the Q-values of tasks that the old agent has learned, but the new agent has not, onto the new agent, which reduces the number of learning trials the new agent needs and speeds up its learning. Simulation results in a 10×10 grid world verify the effectiveness of the proposed knowledge transfer Q-learning algorithm.
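The abstract does not give implementation details, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes the old agent has 4 moves and the new agent 8 moves (diagonals added) on the same 10×10 grid, that Q-tables come from value iteration as a stand-in for converged Q-learning, and that the mapper is a small one-hidden-layer network fed with the old agent's per-state Q-vector. All names (`optimal_q`, `train_mapper`, `MOVES4`, `MOVES8`) and hyperparameters are hypothetical.

```python
import numpy as np

# Assumed heterogeneity: old agent has 4 moves, new agent has 8 (diagonals added).
MOVES4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
MOVES8 = MOVES4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]

def optimal_q(goal, moves, size=10, gamma=0.9, sweeps=40):
    """Q-table of a deterministic grid world via value iteration
    (a stand-in for the Q-values a converged Q-learner would hold)."""
    q = np.zeros((size, size, len(moves)))
    for _ in range(sweeps):
        v = q.max(axis=2)
        v[goal] = 0.0  # the goal cell is terminal
        for a, (dr, dc) in enumerate(moves):
            for r in range(size):
                for c in range(size):
                    nr = min(max(r + dr, 0), size - 1)  # walls clip the move
                    nc = min(max(c + dc, 0), size - 1)
                    reward = 1.0 if (nr, nc) == goal else 0.0
                    q[r, c, a] = reward + gamma * v[nr, nc]
    return q.reshape(size * size, len(moves))

def train_mapper(x, y, hidden=16, lr=0.2, epochs=2000, seed=0):
    """Fit a one-hidden-layer tanh network y ~ f(x) by full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    w1 = rng.normal(0.0, 0.3, (x.shape[1], hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 0.3, (hidden, y.shape[1]))
    b2 = np.zeros(y.shape[1])
    losses = []
    for _ in range(epochs):
        h = np.tanh(x @ w1 + b1)
        err = h @ w2 + b2 - y
        losses.append(float((err ** 2).mean()))
        g_out = 2.0 * err / err.size             # gradient of the mean squared error
        g_h = (g_out @ w2.T) * (1.0 - h ** 2)    # backpropagate through tanh
        w2 -= lr * (h.T @ g_out)
        b2 -= lr * g_out.sum(axis=0)
        w1 -= lr * (x.T @ g_h)
        b1 -= lr * g_h.sum(axis=0)

    def mapper(q_old):
        return np.tanh(q_old @ w1 + b1) @ w2 + b2

    return mapper, losses

# Shared task: both agents already hold Q-values, so the mapper can be trained offline.
shared_goal, new_goal = (4, 5), (7, 2)
q_old_shared = optimal_q(shared_goal, MOVES4)
q_new_shared = optimal_q(shared_goal, MOVES8)
mapper, losses = train_mapper(q_old_shared, q_new_shared)

# New task: only the old agent has learned it; map its Q-values to seed the new agent.
q_old_new_task = optimal_q(new_goal, MOVES4)
q_new_init = mapper(q_old_new_task)  # initial Q-table for the new agent, shape (100, 8)
print(f"mapper training loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

In this sketch `q_new_init` would replace a zero-initialized Q-table before the new agent starts ordinary Q-learning on the new task. How much that shortens learning depends on how well the mapper generalizes from the shared task, which this example does not measure.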

Keywords: reinforcement learning; knowledge transfer; heterogeneous agents; Q-value

Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]

 
