U-Clustering: A Reinforcement Learning Algorithm Based on Utility Clustering

Source: Computer Engineering and Applications (《计算机工程与应用》), 2005, No. 26, pp. 37-42, 74 (7 pages)

Funding: Supported by the National Natural Science Foundation of China (Grant No. 60075019)

Abstract: This paper proposes U-Clustering, a new reinforcement learning algorithm based on utility clustering. Unlike U-Tree, it does not generate fringe nodes or run the associated statistical tests at all. It first clusters the instances that share a matching history up to a certain length according to their observation-action values, then performs feature selection on each cluster, and finally compresses the selected features. The compressed features become new nodes in the agent's internal state-space tree. Simulation on New York Driving [2,13], a difficult partially observable driving task, together with an experimental analysis of the algorithm, indicates that U-Clustering is an effective approach to large partially observable problems.
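The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the instance format, the history length, or the clustering criterion, so the dictionary layout, the parameter `k`, the gap threshold `tol`, and both function names are assumptions made purely for illustration. Feature selection and feature compression are left out because the abstract gives no detail on them.

```python
from collections import defaultdict


def group_by_history(instances, k):
    """Group instance indices by their last k (action, observation) pairs.

    `instances` is a hypothetical list of dicts with keys "action" and "obs",
    ordered in time. Instances without k steps of history are skipped.
    """
    groups = defaultdict(list)
    for t in range(k - 1, len(instances)):
        window = instances[t - k + 1 : t + 1]
        suffix = tuple((inst["action"], inst["obs"]) for inst in window)
        groups[suffix].append(t)
    return dict(groups)


def cluster_by_value(indices, values, tol):
    """Split indices into clusters of similar observation-action value.

    Assumed criterion (not from the paper): sort by value and start a new
    cluster whenever the gap to the previous value exceeds `tol`.
    """
    ordered = sorted(indices, key=lambda i: values[i])
    clusters = []
    for i in ordered:
        if clusters and values[i] - values[clusters[-1][-1]] <= tol:
            clusters[-1].append(i)
        else:
            clusters.append([i])
    return clusters
```

In this sketch, instances are first bucketed by matching history, and each bucket can then be passed to `cluster_by_value` together with whatever value estimates the agent maintains.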

Keywords: reinforcement learning; utility clustering; partially observable Markov decision process

Classification code: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]
