Discrete-time dynamic graphical games: model-free reinforcement learning solution (cited by: 7)


Authors: Mohammed I. Abouheaf, Frank L. Lewis, Magdi S. Mahmoud, Dariusz G. Mikulski

Affiliations: [1] Systems Engineering Department, King Fahd University of Petroleum & Minerals; [2] UTA Research Institute, University of Texas at Arlington, Fort Worth, Texas, U.S.A.; [3] State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University; [4] Ground Vehicle Robotics (GVR), U.S. Army TARDEC, Warren, MI, U.S.A.

Published in: Control Theory and Technology, 2015, Issue 1, pp. 55-69 (15 pages)

Funding: Supported by the Deanship of Scientific Research at King Fahd University of Petroleum & Minerals (Project No. JF141002); the National Science Foundation (No. ECCS-1405173); the Office of Naval Research (Nos. N000141310562 and N000141410718); the U.S. Army Research Office (No. W911NF-11-D-0001); the National Natural Science Foundation of China (No. 61120106011); and Project 111 of the Ministry of Education of China (No. B08015).

Abstract: This paper introduces a model-free reinforcement learning technique that is used to solve a class of dynamic games known as dynamic graphical games. The graphical game results from multi-agent dynamical systems, where pinning control is used to make all the agents synchronize to the state of a command generator or a leader agent. Novel coupled Bellman equations and Hamiltonian functions are developed for the dynamic graphical games. The Hamiltonian mechanics are used to derive the necessary conditions for optimality. The solution for the dynamic graphical game is given in terms of the solution to a set of coupled Hamilton-Jacobi-Bellman equations developed herein. The Nash equilibrium solution for the graphical game is given in terms of the solution to the underlying coupled Hamilton-Jacobi-Bellman equations. An online model-free policy iteration algorithm is developed to learn the Nash solution for the dynamic graphical game. This algorithm does not require any knowledge of the agents' dynamics. A proof of convergence for this multi-agent learning algorithm is given under mild assumptions about the interconnectivity properties of the graph. A gradient descent technique with critic network structures is used to implement the policy iteration algorithm to solve the graphical game online in real time.
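To make the setting concrete, the following is a minimal sketch of the dynamic-graphical-game formulation the abstract refers to, written in notation common to this line of work; the specific symbols (local error delta_i, in-degree d_i, pinning gain g_i, edge weights e_ij, neighbor set N_i, weighting matrices Q_ii, R_ii, R_ij) are assumptions for illustration, not quotations from the paper. Each agent i measures a local neighborhood tracking error driven by its own control and its neighbors' controls, and the coupled Bellman equation values that error under all interacting policies:

```latex
% Local neighborhood tracking error dynamics for agent i
% (A, B: agent dynamics; d_i: in-degree; g_i: pinning gain; N_i: neighbor set)
\delta_i(k+1) = A\,\delta_i(k) + (d_i + g_i)\,B\,u_i(k)
              - \sum_{j \in N_i} e_{ij}\,B\,u_j(k)

% Coupled Bellman equation: agent i's value under all interacting policies
V_i\bigl(\delta_i(k)\bigr)
  = \delta_i(k)^{\top} Q_{ii}\,\delta_i(k)
  + u_i(k)^{\top} R_{ii}\,u_i(k)
  + \sum_{j \in N_i} u_j(k)^{\top} R_{ij}\,u_j(k)
  + V_i\bigl(\delta_i(k+1)\bigr)
```

The sketch below illustrates the flavor of the model-free policy iteration the abstract describes, on a hypothetical three-follower graph with scalar dynamics. It is not the authors' algorithm: the paper tunes value-function critics by gradient descent, while this sketch swaps in a per-agent quadratic Q-function critic with a least-squares policy-evaluation step, and it omits the neighbor control terms from the stage cost for brevity. All names and constants are illustrative.

```python
# Toy sketch of model-free policy iteration for a dynamic graphical game.
# Hypothetical setup, not the paper's: 3 scalar followers on a fixed graph,
# one follower pinned to an autonomous leader (command generator).
# The learning updates below use only measured transitions (delta, u, delta'),
# never the model parameters a, b or the graph matrices.
import numpy as np

rng = np.random.default_rng(0)

E = np.array([[0., 1., 0.],           # e_ij: edge weights among followers
              [1., 0., 1.],
              [0., 1., 0.]])
g = np.array([1., 0., 0.])            # pinning gains to the leader
N = len(g)

a, b = 0.95, 0.5                      # scalar agent dynamics: x+ = a*x + b*u
Qc, Rc = 1.0, 0.1                     # stage-cost weights

def local_errors(x, x0):
    """Local neighborhood tracking errors delta_i."""
    return np.array([E[i] @ (x[i] - x) + g[i] * (x[i] - x0)
                     for i in range(N)])

def phi(d, u):
    """Quadratic critic features: Q_i(delta, u) = theta . phi(delta, u)."""
    return np.array([d * d, 2.0 * d * u, u * u])

K = np.zeros(N)                       # initial stabilizing policies u_i = -K_i * delta_i
for it in range(10):
    # Collect data on-policy with probing noise (persistence of excitation).
    x, x0 = rng.normal(size=N), 1.0
    data = [[] for _ in range(N)]
    for k in range(200):
        d_now = local_errors(x, x0)
        u = -K * d_now + 0.3 * rng.normal(size=N)
        x, x0 = a * x + b * u, a * x0          # leader is autonomous
        d_next = local_errors(x, x0)
        for i in range(N):
            data[i].append((d_now[i], u[i], d_next[i]))
    for i in range(N):
        # Policy evaluation: least-squares TD fit of the Q-critic.
        A_ls = [phi(dk, uk) - phi(dk1, -K[i] * dk1) for dk, uk, dk1 in data[i]]
        b_ls = [Qc * dk**2 + Rc * uk**2 for dk, uk, _ in data[i]]  # neighbor terms omitted
        h11, h12, h22 = np.linalg.lstsq(np.array(A_ls), np.array(b_ls), rcond=None)[0]
        # Policy improvement: minimize the fitted quadratic Q over the agent's own control.
        if h22 > 1e-9:
            K[i] = h12 / h22
    print(f"iteration {it}: gains K = {np.round(K, 4)}")
```

Each outer iteration alternates the two steps the abstract names: policy evaluation (fitting each agent's critic from measured transitions alone) and policy improvement (minimizing the fitted quadratic critic over the agent's own control). The model parameters a, b and the graph matrices appear only in the simulator, which stands in for online measurements of the running system.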

Keywords: dynamic graphical games; Nash equilibrium; discrete mechanics; optimal control; model-free reinforcement learning; policy iteration

Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]

 
