RoboCup-2D passing strategy based on joint reinforcement learning

Author: Chang Xiaojun (常晓军)[1]

Source: Computer Engineering and Applications (《计算机工程与应用》), 2011, No. 23, pp. 212-216, 219 (6 pages)

Abstract: A joint Q-learning algorithm for multi-agent systems (MAS) is proposed by introducing the multi-agent setting into the traditional Q-learning algorithm. All agents learn under the same evaluation function, and the learning process takes into account the learning results of every agent that participates in the collaboration. In the RoboCup-2D soccer simulation, a pitch state decomposition method is introduced to reduce the number of state components, and the optimal state obtained by joint learning is adopted as the optimal action group for multi-agent cooperation. This effectively solves the passing strategy and the cooperation problem among the agents in the simulation. Simulation and experimental results demonstrate the validity and reliability of the proposed algorithm.
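The abstract does not state the update rule, the reward, or how the pitch state is decomposed, so the following is only a minimal sketch of one common way to realise "learning under the same evaluation function": a single tabular Q-function over joint actions, updated with the standard Q-learning rule from one shared reward. The class name `JointQLearner`, the action names, the state labels `"s0"` and `"end"`, and all hyperparameters are assumptions made for illustration and are not taken from the paper.

```python
import itertools
import random
from collections import defaultdict


class JointQLearner:
    """Tabular Q-learning over joint actions with one shared reward (illustrative only)."""

    def __init__(self, agent_actions, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
        # agent_actions: one list of available actions per cooperating agent.
        # The joint action space is the Cartesian product of those lists.
        self.joint_actions = list(itertools.product(*agent_actions))
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.q = defaultdict(float)  # (state, joint_action) -> shared value estimate
        self.rng = random.Random(seed)

    def best_joint_action(self, state):
        # Greedy action group: the joint action with the highest shared value.
        return max(self.joint_actions, key=lambda a: self.q[(state, a)])

    def choose(self, state):
        # Epsilon-greedy exploration over joint actions.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.joint_actions)
        return self.best_joint_action(state)

    def update(self, state, joint_action, reward, next_state):
        # Standard Q-learning update applied to the joint action.
        best_next = max(self.q[(next_state, a)] for a in self.joint_actions)
        td_target = reward + self.gamma * best_next
        key = (state, joint_action)
        self.q[key] += self.alpha * (td_target - self.q[key])


# Toy usage (hypothetical): a passer and a receiver are rewarded only when
# the pass direction matches the direction the receiver moves in.
passer = ["pass_left", "pass_right"]
receiver = ["move_left", "move_right"]
learner = JointQLearner([passer, receiver])

for _ in range(500):
    action = learner.choose("s0")
    same_side = action[0].split("_")[1] == action[1].split("_")[1]
    learner.update("s0", action, 1.0 if same_side else 0.0, "end")

print(learner.best_joint_action("s0"))
```

A table over joint actions grows exponentially with the number of agents, which is one reason a reduced state description, such as the pitch state decomposition the abstract mentions, matters in practice.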

Keywords: multi-agent system; joint Q-learning algorithm; RoboCup-2D; pitch state decomposition method

Classification code: TP242.6 [Automation and Computer Technology: Detection Technology and Automation Devices]
