Research on performance potentials algorithm and its application in RoboCup


Authors: Yang Wanlu (杨宛璐) [1], Chen Wei (陈玮) [1], Huang Haohui (黄浩晖), Wang Guangtao (王广涛) [1]

Affiliation: [1] School of Automation, Guangdong University of Technology, Guangzhou 510006, Guangdong, China

Source: Computer Engineering and Design (《计算机工程与设计》), 2014, No. 3, pp. 905-908 (4 pages)

Abstract: Reinforcement learning is an important approach to learning control in artificial intelligence. Average-reward reinforcement learning uses the average reward as its reference standard and suits problems that are cyclic or have no terminal state. In many such problems, for example when the optimal behavior is a limit cycle, it is more natural and computationally advantageous to formulate the task so that the controller maximizes the average payoff received per time step. However, the approach is sensitive to its parameters and to the environment, converges slowly, and emphasizes the independent learning of a single agent. To address these problems, and taking into account the relationship and mutual influence between an individual agent and the other agents, an improved reinforcement learning algorithm based on performance potentials, G-learning, is introduced into multi-agent systems to design a new reinforcement learning algorithm. The new algorithm is applied to the RoboCup Keepaway platform. Simulation results show that, when a good reference state is chosen, it effectively improves the performance of reinforcement learning on Keepaway.

Keywords: soccer robot; reinforcement learning; performance potential; G-learning algorithm; multi-agent system

Classification number: TP242.6 [Automation and Computer Technology: Detection Technology and Automatic Equipment]
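
Note on the abstract: it does not state the G-learning update rule, so the sketch below is not the authors' algorithm. It illustrates the general idea the abstract relies on (average-reward learning measured against a reference state) using a different, well-known method, RVI Q-learning, in which the value at a chosen reference state serves as the running estimate of the average reward. The two-state task, the names `TRANSITIONS`, `rvi_q_learning`, and `greedy_policy`, and all parameter values are assumptions made for illustration only.

```python
import random

# Hypothetical two-state continuing task (not from the paper).
# TRANSITIONS[state][action] = (next_state, reward); there is no terminal state.
TRANSITIONS = {
    0: {0: (0, 1.0), 1: (1, 0.0)},  # stay in 0 for reward 1, or move to 1
    1: {0: (1, 3.0), 1: (0, 0.0)},  # stay in 1 for reward 3, or move to 0
}
ACTIONS = (0, 1)


def rvi_q_learning(steps=20000, alpha=0.1, ref_state=0, seed=0):
    """Tabular RVI Q-learning (an assumed stand-in, not the paper's G-learning)."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in ACTIONS} for s in TRANSITIONS}
    state = 0
    for _ in range(steps):
        # Uniformly random behavior policy: enough exploration for a toy problem.
        action = rng.choice(ACTIONS)
        next_state, reward = TRANSITIONS[state][action]
        # The best value at the reference state acts as the average-reward estimate.
        gain_estimate = max(q[ref_state].values())
        td_error = (reward - gain_estimate
                    + max(q[next_state].values()) - q[state][action])
        q[state][action] += alpha * td_error
        state = next_state
    return q


def greedy_policy(q):
    """Return the action with the highest learned value in each state."""
    return {s: max(ACTIONS, key=lambda a: q[s][a]) for s in q}


q_table = rvi_q_learning()
print("estimated average reward:", max(q_table[0].values()))
print("greedy policy:", greedy_policy(q_table))
```

In this toy task the best long-run behavior is to move to state 1 and stay there, collecting an average reward of 3 per step. The choice of `ref_state` only shifts the learned values by a constant here; how the reference state affects learning speed on Keepaway is the paper's own result and is not reproduced by this sketch.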

 
