Affiliation: [1] Department of Automation, Xiamen University, Xiamen 361005
Source: Journal of Beijing University of Technology (《北京工业大学学报》), 2012, No. 9, pp. 1348-1352 (5 pages)
Funding: Natural Science Foundation of Fujian Province (2010J05140); Specialized Research Fund for the Doctoral Program of Higher Education (20100121120022)
Abstract: In RoboCup simulated soccer, the state of play is complex and changes quickly, most of the information available for decision-making consists of continuous variables, and an agent usually cannot determine the optimal action for the current state from the information at hand. Taking the goalkeeper as a case study, this paper first generalizes the continuous state space with a CMAC neural network and then applies the Sarsa(λ) learning algorithm on the generalized states to obtain the goalkeeper's optimal policy. Simulations on the RoboCup simulation platform show that, after a period of learning, a goalkeeper using the CMAC-based Sarsa(λ) algorithm holds its defense for significantly longer and defends markedly better than goalkeepers using other algorithms, which supports the effectiveness of the proposed scheme.
Keywords: RoboCup simulated soccer; CMAC neural network; generalization; Sarsa(λ) learning algorithm; optimal policy
Classification: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]
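
The two-stage approach in the abstract (generalize the continuous state with a CMAC, then learn on the generalized states with Sarsa(λ)) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the CMAC is written as plain tile coding, replacing eligibility traces are assumed, and every name, parameter value, and the one-dimensional toy task in `run_episode` are hypothetical and do not come from the paper or from the RoboCup simulator.

```python
import math
import random

NUM_TILINGS = 8      # assumed number of overlapping tilings
TILES_PER_DIM = 10   # assumed resolution of each tiling

def active_tiles(state, action):
    """Return one tile key per tiling for a state with components in [0, 1]."""
    keys = []
    for t in range(NUM_TILINGS):
        # Each tiling is shifted by a fraction of one tile width.
        offset = t / (NUM_TILINGS * TILES_PER_DIM)
        coords = tuple(math.floor((x + offset) * TILES_PER_DIM) for x in state)
        keys.append((t, coords, action))
    return keys

class SarsaLambdaAgent:
    def __init__(self, n_actions, alpha=0.1, gamma=0.95, lam=0.9,
                 epsilon=0.1, seed=0):
        self.n_actions = n_actions
        self.alpha = alpha / NUM_TILINGS  # step size shared across tilings
        self.gamma, self.lam, self.epsilon = gamma, lam, epsilon
        self.w = {}   # tile weights (the CMAC memory)
        self.e = {}   # eligibility traces
        self.rng = random.Random(seed)

    def q(self, state, action):
        """Q(s, a) is the sum of the weights of the active tiles."""
        return sum(self.w.get(k, 0.0) for k in active_tiles(state, action))

    def choose(self, state):
        """Epsilon-greedy action selection."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        qs = [self.q(state, a) for a in range(self.n_actions)]
        return qs.index(max(qs))

    def start_episode(self):
        self.e.clear()

    def update(self, s, a, r, s_next, a_next, done):
        """One Sarsa(lambda) step with replacing traces."""
        target = r if done else r + self.gamma * self.q(s_next, a_next)
        delta = target - self.q(s, a)
        for k in active_tiles(s, a):
            self.e[k] = 1.0
        for k, trace in list(self.e.items()):
            self.w[k] = self.w.get(k, 0.0) + self.alpha * delta * trace
            self.e[k] = trace * self.gamma * self.lam

def run_episode(agent, max_steps=200):
    """Hypothetical toy task: move keeper position g toward ball position b."""
    g, b = agent.rng.random(), agent.rng.random()
    agent.start_episode()
    s = (g, b)
    a = agent.choose(s)  # actions: 0 = left, 1 = stay, 2 = right
    for step in range(1, max_steps + 1):
        g = min(1.0, max(0.0, g + (a - 1) * 0.05))
        done = abs(g - b) < 0.05
        r = 0.0 if done else -1.0
        s_next = (g, b)
        a_next = agent.choose(s_next)
        agent.update(s, a, r, s_next, a_next, done)
        if done:
            return step
        s, a = s_next, a_next
    return max_steps
```

Because neighboring states activate many of the same tiles, a single update also shifts the value estimates of nearby states, which is the generalization effect the abstract attributes to the CMAC. A real goalkeeper would need a higher-dimensional state (for example ball and player positions) and the simulator's own action set, neither of which is specified in this record.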