Reinforcement Learning Algorithm for Dynamic Policy Under Mixed Multi-agent Domains  Cited by: 1

Authors: Xiao Zheng (肖正)[1], He Qingsong (何青松)[1], Zhang Shiyong (张世永)[1]

Affiliation: [1] Department of Computer and Information Technology, Fudan University, Shanghai 200433

Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2009, No. 7, pp. 1268-1273 (6 pages)

Funding: Supported by the National Key Basic Research Program of China ("973" Program), grant 2005CB321906

Abstract: Machine learning has received wide attention and in-depth study for collaboration and action selection in multi-agent systems. This paper analyzes equilibrium-based and best-response-based learning algorithms and proposes two reinforcement learning algorithms for dynamic policies in mixed multi-agent domains. The algorithms not only adapt to the policies of other agents and to changes in those policies, but also use past behavior history to form more accurate time-dependent policies. Experiments on two well-known zero-sum games validate the convergence and rationality of the algorithms and show that they obtain higher payoffs in repeated games against best-response agents.

Keywords: multi-agent systems; action selection; dynamic policy; reinforcement learning

Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]

 
