Study on optimal consensus control of discrete multi-agent systems based on policy iteration

Original title: 基于策略迭代的离散多智能体系统最优一致性控制研究

Cited by: 1


Authors: HUANG Kailiang (黄垲亮); WANG Li (王莉) [1]

Affiliation: [1] School of Computer Science and Engineering, Tianjin University of Technology, Tianjin 300384, China

Source: Journal of Tianjin University of Technology (《天津理工大学学报》), 2023, No. 6, pp. 26-33 (8 pages)

Funding: National Natural Science Foundation of China (61403280, 61773286).

Abstract: A novel policy iteration algorithm is proposed for the optimal consensus control problem of discrete-time multi-agent systems with unknown dynamics. The problem is formulated through the Hamilton-Jacobi-Bellman (HJB) equation. In practical applications, the complexity and uncertainty of multi-agent systems mean that a complete dynamic model cannot be obtained, so the HJB equation cannot be solved by conventional mathematical methods. To overcome these difficulties, the new method draws on reinforcement learning: it requires only the state information error between each agent and its neighbors to approximate the optimal control policy. A convergence proof of the policy iteration algorithm is given, showing theoretically that the method can solve the optimal consensus control problem for unknown multi-agent systems. Simulation experiments show that the proposed algorithm is more efficient than the traditional policy iteration algorithm.
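The "state information error between each agent and its neighbors" can be illustrated with a minimal sketch. It assumes the error takes the form commonly used in the consensus literature, e_i = sum_j a_ij (x_i - x_j), with scalar agent states. The record does not give the paper's exact definition, so the function name `neighborhood_errors`, the adjacency matrix, and the state values are all hypothetical.

```python
def neighborhood_errors(adjacency, states):
    """Return e_i = sum_j a_ij * (x_i - x_j) for every agent i.

    Assumed form of the local neighborhood error; the paper's own
    definition may differ (for example, it may include a leader term).
    """
    n = len(states)
    return [
        sum(adjacency[i][j] * (states[i] - states[j]) for j in range(n))
        for i in range(n)
    ]


# Hypothetical 3-agent path graph: agent 1 is linked to agents 0 and 2.
adjacency = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
states = [1.0, 2.0, 4.0]  # illustrative scalar states, not from the paper

errors = neighborhood_errors(adjacency, states)
print(errors)  # prints [-1.0, -1.0, 2.0]
```

Every error is zero exactly when each agent agrees with all of its neighbors, which is why a controller that drives these errors to zero achieves consensus without needing the full dynamic model.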

Keywords: multi-agent systems; optimal consensus control; policy iteration; reinforcement learning

Classification code: TP273.1 [Automation and Computer Technology: Detection Technology and Automatic Devices]

 
