Reinforcement Learning Method of Continuous State Adaptively Discretized Based on K-means Clustering

Cited by: 7

Source: Control and Decision (《控制与决策》), 2006, No. 2, pp. 143-147 (5 pages)

Funding: National Natural Science Foundation of China (60575033); National High-Level University 985 Program (KY2701)

Abstract: A reinforcement learning method based on K-means clustering is proposed, which uses a clustering algorithm to adaptively discretize a continuous state space. Learning is divided into two parts: state-space learning, which uses the K-means clustering algorithm to adaptively discretize the continuous state space, and policy learning, which uses the Sarsa algorithm with replacing eligibility traces to find the optimal policy. Simulation experiments on a continuous-state reinforcement learning benchmark problem show that the method adaptively discretizes the continuous state space and eventually learns the optimal policy. A comparison with a CMAC-network-based reinforcement learning method shows that the proposed method saves storage space and reduces computation time.
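The two learning processes named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which K-means variant is used or how the two processes are interleaved, so the sketch assumes an incremental (online) K-means update and a tabular Sarsa(λ) update with replacing eligibility traces. The function names `nearest`, `update_center`, and `sarsa_lambda_step`, and the default values of `alpha`, `gamma`, and `lam`, are hypothetical.

```python
from typing import List


def nearest(centers: List[List[float]], x: List[float]) -> int:
    """Return the index of the cluster center closest to x (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda i: sum((c - v) ** 2 for c, v in zip(centers[i], x)),
    )


def update_center(centers: List[List[float]], counts: List[int], x: List[float]) -> int:
    """State-space learning (assumed online K-means): move the winning center
    toward x with step size 1/count and return its index as the discrete state."""
    i = nearest(centers, x)
    counts[i] += 1
    rate = 1.0 / counts[i]
    centers[i] = [c + rate * (v - c) for c, v in zip(centers[i], x)]
    return i


def sarsa_lambda_step(Q, E, s, a, r, s2, a2, done,
                      alpha=0.1, gamma=0.95, lam=0.9) -> float:
    """Policy learning: one tabular Sarsa(lambda) update with replacing traces.
    Q and E are lists of per-state action-value and trace lists, modified in place."""
    target = r if done else r + gamma * Q[s2][a2]
    delta = target - Q[s][a]
    E[s][a] = 1.0  # replacing trace: reset to 1 instead of accumulating
    for i in range(len(Q)):
        for j in range(len(Q[i])):
            Q[i][j] += alpha * delta * E[i][j]
            E[i][j] *= gamma * lam  # decay every trace after the update
    return delta
```

In such a sketch, each continuous observation would first be passed to `update_center` to obtain a discrete state index, and that index would then be used as `s` or `s2` in `sarsa_lambda_step`. The memory cost is one row of action values per cluster center, so the table size is set by the chosen number of clusters.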

Keywords: reinforcement learning; K-means clustering algorithm; Sarsa learning; continuous state; adaptive discretization

Classification: TP13 [Automation and Computer Technology: Control Theory and Control Engineering]
