CMAC-based Sarsa(λ) Learning Algorithm for RoboCup-soccer Goalkeeper

Original title: 基于CMAC网络Sarsa(λ)学习的RoboCup守门员策略

Source: Journal of Beijing University of Technology (《北京工业大学学报》), 2012, No. 9, pp. 1348-1352 (5 pages)

Funding: Natural Science Foundation of Fujian Province (2010J05140); Specialized Research Fund for the Doctoral Program of Higher Education (20100121120022)

Abstract: RoboCup simulated soccer has a large, complex, and rapidly changing state space, and most of the variables available for decision making are continuous, so an agent usually cannot determine the optimal action for its current state from the information at hand. Taking the goalkeeper as a case study, this paper first uses a CMAC neural network to generalize over the continuous state space and then applies the Sarsa(λ) learning algorithm on the generalized states to obtain the goalkeeper's optimal policy. Goalkeepers using different strategies were evaluated and compared empirically on the RoboCup simulation platform. The results show that, after a period of learning, the goalkeeper using the CMAC-based Sarsa(λ) algorithm defends for significantly longer and defends markedly better than goalkeepers using the other algorithms, which supports the effectiveness of the proposed scheme.

Keywords: RoboCup simulated soccer; CMAC neural network; generalization; Sarsa(λ) learning algorithm; optimal policy

Classification: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]
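
The two-stage approach summarized in the abstract (CMAC generalization of a continuous state, then Sarsa(λ) on the generalized features) can be illustrated with a minimal sketch. This record does not give the paper's state variables, action set, reward, or hyperparameters, so everything below is an assumption made for illustration: the `CMAC` tile coder with uniformly offset tilings, the replacing eligibility traces, the ε-greedy exploration, the values of `alpha`, `gamma`, `lam`, and `epsilon`, and the hypothetical `ToyKeeperEnv`, a one-dimensional stand-in for the soccer server in which the keeper tracks a drifting ball and earns +1 for every step it holds out.

```python
import random


class CMAC:
    """Tile-coding (CMAC) linear approximator of Q(state, action)."""

    def __init__(self, lows, highs, n_actions, n_tilings=8, n_bins=8):
        self.lows, self.highs = lows, highs
        self.n_actions, self.n_tilings, self.n_bins = n_actions, n_tilings, n_bins
        self.tiles_per_tiling = (n_bins + 1) ** len(lows)
        self.weights = [0.0] * (n_actions * n_tilings * self.tiles_per_tiling)

    def active_tiles(self, state, action):
        """Return one active weight index per tiling for (state, action)."""
        indices = []
        for t in range(self.n_tilings):
            offset = t / self.n_tilings  # shift each tiling by a fraction of a tile
            flat = 0
            for x, lo, hi in zip(state, self.lows, self.highs):
                scaled = (x - lo) / (hi - lo) * self.n_bins + offset
                cell = min(max(int(scaled), 0), self.n_bins)
                flat = flat * (self.n_bins + 1) + cell
            indices.append((action * self.n_tilings + t) * self.tiles_per_tiling + flat)
        return indices

    def q(self, state, action):
        return sum(self.weights[i] for i in self.active_tiles(state, action))


class ToyKeeperEnv:
    """Hypothetical stand-in for the simulator: keep close to a drifting ball."""

    def __init__(self, rng, max_steps=50):
        self.rng, self.max_steps = rng, max_steps

    def reset(self):
        self.keeper, self.ball, self.t = 0.0, self.rng.uniform(-0.3, 0.3), 0
        return (self.keeper, self.ball)

    def step(self, action):
        # Actions 0, 1, 2 move the keeper by -0.1, 0, +0.1.
        self.keeper = min(max(self.keeper + 0.1 * (action - 1), -1.0), 1.0)
        self.ball = min(max(self.ball + self.rng.uniform(-0.1, 0.1), -1.0), 1.0)
        self.t += 1
        conceded = abs(self.keeper - self.ball) > 0.4
        done = conceded or self.t >= self.max_steps
        return (self.keeper, self.ball), 1.0, done  # +1 per step survived


def epsilon_greedy(cmac, state, epsilon, rng):
    if rng.random() < epsilon:
        return rng.randrange(cmac.n_actions)
    values = [cmac.q(state, a) for a in range(cmac.n_actions)]
    return values.index(max(values))


def sarsa_lambda_episode(env, cmac, alpha=0.1, gamma=1.0, lam=0.9,
                         epsilon=0.1, rng=random):
    """Run one episode of Sarsa(lambda) with replacing traces; return its total reward."""
    traces = {}
    state = env.reset()
    action = epsilon_greedy(cmac, state, epsilon, rng)
    total, done = 0.0, False
    while not done:
        next_state, reward, done = env.step(action)
        total += reward
        delta = reward - cmac.q(state, action)
        if not done:
            next_action = epsilon_greedy(cmac, next_state, epsilon, rng)
            delta += gamma * cmac.q(next_state, next_action)
        for i in cmac.active_tiles(state, action):
            traces[i] = 1.0  # replacing trace on the tiles just visited
        step = alpha / cmac.n_tilings * delta
        for i in list(traces):
            cmac.weights[i] += step * traces[i]
            traces[i] *= gamma * lam
            if traces[i] < 1e-4:
                del traces[i]  # drop negligible traces to keep the dict small
        if not done:
            state, action = next_state, next_action
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    env = ToyKeeperEnv(rng)
    cmac = CMAC(lows=(-1.0, -1.0), highs=(1.0, 1.0), n_actions=3)
    returns = [sarsa_lambda_episode(env, cmac, rng=rng) for _ in range(300)]
    print("mean defending time, first 50 episodes:", sum(returns[:50]) / 50)
    print("mean defending time, last 50 episodes:", sum(returns[-50:]) / 50)
```

Because each tiling contributes exactly one active weight, nearby states share most of their tiles, which is the generalization effect the abstract attributes to the CMAC stage. A real goalkeeper would replace `ToyKeeperEnv` with an interface to the simulator and a richer state, and its learning curve need not resemble this toy's.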
