Design of a New Reward Function Based on Multi-Agent Reinforcement Learning (cited by: 4)

A Reward Function Based on Reinforcement Learning of Multi-agent

Source: Control Engineering of China, 2009, No. 2, pp. 239-242 (4 pages)

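Funding: Key Science and Technology Development Fund of the Beijing Municipal Education Commission (EM200610005019); Doctoral Research Start-up Fund of Beijing University of Technology (52002011200708)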

Abstract: To improve the performance of reinforcement learning in multi-agent systems, a new penalty-based reward function is designed for Keepaway, a typical multi-agent platform in which every episode ends in failure. The design is inspired by the reward function used in the pole-balancing system, a single-agent system with the same property. The new reward function gives a reward of zero in successful states and a negative penalty in failure states. The Sarsa(λ) algorithm with the new reward function is applied successfully to the Keepaway platform. Simulation results show that, under certain parameter settings, the new reward function effectively improves the performance of reinforcement learning on Keepaway and yields a better final learning result.
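The reward scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the penalty value of -1, the discount factor, and the names `punitive_reward` and `discounted_return` are assumptions made for illustration only. The sketch shows why a zero-until-failure reward can still favor longer episodes when future rewards are discounted.

```python
def punitive_reward(failed: bool, penalty: float = -1.0) -> float:
    """Zero reward in a successful (ongoing) state, a negative penalty at failure.

    The penalty magnitude of -1.0 is an assumed value, not taken from the paper.
    """
    return penalty if failed else 0.0


def discounted_return(episode_length: int, gamma: float = 0.9) -> float:
    """Discounted return, seen from step 0, of an episode that fails on its last step.

    gamma is an assumed discount factor chosen for illustration.
    """
    total = 0.0
    for t in range(episode_length):
        failed = (t == episode_length - 1)  # only the final step is a failure state
        total += (gamma ** t) * punitive_reward(failed)
    return total


if __name__ == "__main__":
    # Longer episodes push the penalty further into the future,
    # so their discounted return is closer to zero.
    for length in (1, 5, 20):
        print(length, discounted_return(length))
```

In this sketch the return of an episode of length T is penalty × gamma^(T-1), so with a discount factor below 1 a later failure is penalized less. With gamma equal to 1 every episode would receive the same return, which is one way to read the abstract's remark that the improvement holds under certain parameter settings; that reading is an interpretation, not a statement from the paper.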

Keywords: Keepaway; multi-agent system; reinforcement learning; reward function; RoboCup

Classification number: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]
