Reinforcement learning-based unknown reference tracking control of HMASs with nonidentical communication delays (Cited by: 2)


Authors: Yong XU, Zheng-Guang WU, Wei-Wei CHE, Deyuan MENG

Affiliations: [1] School of Automation, Beijing Institute of Technology, Beijing 100081, China [2] Institute of Cyber-Systems and Control, Zhejiang University, Hangzhou 310027, China [3] College of Mathematics and Computer Science, Zhejiang Normal University, Jinhua 321004, China [4] Department of Automation, Qingdao University, Qingdao 266071, China [5] School of Automation Science and Electrical Engineering, Beihang University (BUAA), Beijing 100191, China

Source: Science China (Information Sciences), 2023, Issue 7, pp. 42-53 (12 pages)

Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62103047, U1966202); the Beijing Institute of Technology Research Fund Program for Young Scholars; and the Young Elite Scientists Sponsorship Program by BAST (Grant No. BYESS2023365).

Abstract: This paper focuses on the optimal output synchronization control problem of heterogeneous multiagent systems (HMASs) subject to nonidentical communication delays, addressed by a reinforcement learning method. In contrast to existing studies that assume the leader's precise model is globally or distributively accessible to all or some of the followers, here the leader's precise dynamical model is entirely inaccessible to all followers. A data-based learning algorithm is first proposed to reconstruct the leader's unknown system matrix online. A distributed predictor subject to communication delays is further devised to estimate the leader's state, where interaction delays are allowed to be nonidentical. Then, a learning-based local controller, together with a discounted performance function, is designed to achieve optimal output synchronization. Bellman equations and game algebraic Riccati equations are constructed to learn the optimal solution by developing a model-based reinforcement learning (RL) algorithm online without solving regulator equations; this is followed by a model-free off-policy RL algorithm that relaxes the model-based algorithm's requirement of knowing all agents' dynamics. The optimal tracking control of HMASs subject to unknown leader dynamics and communication delays is shown to be solvable under the proposed RL algorithms. Finally, the effectiveness of the theoretical analysis is verified by numerical simulations.
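To illustrate the kind of computation the abstract describes, the following is a minimal policy-iteration sketch on a single-agent discounted LQR with a known model. It is not the paper's HMAS algorithm: the matrices, weights, and discount factor are illustrative placeholders, and the distributed predictor and output-regulation layers are omitted. Each iteration alternates policy evaluation (a Lyapunov equation) and policy improvement, converging to the solution of the discounted algebraic Riccati equation.

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov, solve_discrete_are

# Illustrative discrete-time system (not taken from the paper).
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)          # state weight
R = np.array([[1.0]])  # input weight
gamma = 0.95           # discount factor of the performance function

# Policy iteration for the discounted Riccati equation.
K = np.zeros((1, 2))   # A is Schur stable here, so K = 0 is a stabilizing start
for _ in range(50):
    Ac = A - B @ K
    # Policy evaluation: solve P = Q + K'RK + gamma * Ac' P Ac
    P = solve_discrete_lyapunov(np.sqrt(gamma) * Ac.T, Q + K.T @ R @ K)
    # Policy improvement: K = gamma * (R + gamma B'PB)^{-1} B'PA
    K = gamma * np.linalg.solve(R + gamma * B.T @ P @ B, B.T @ P @ A)

# Cross-check: the discounted problem equals a standard DARE
# on the scaled matrices sqrt(gamma)*A, sqrt(gamma)*B.
P_star = solve_discrete_are(np.sqrt(gamma) * A, np.sqrt(gamma) * B, Q, R)
print(np.allclose(P, P_star, atol=1e-6))
```

The model-free off-policy variant mentioned in the abstract would replace the Lyapunov solve with a least-squares fit of the same Bellman equation to measured state-input data, removing the need for A and B.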

Keywords: heterogeneous multiagent systems (HMASs); reinforcement learning (RL); optimal output synchronization; communication delays

Classification: TP18 [Automation and Computer Technology / Control Theory and Control Engineering]; TP273 [Automation and Computer Technology / Control Science and Engineering]

 
