Authors: ZHANG Bao-Qiang, WANG Bing-Chang, CAO Ying
Affiliation: School of Control Science and Engineering, Shandong University, Jinan 250000, China
Source: Journal of Systems Science & Complexity (English edition), 2024, No. 5, pp. 1907-1922 (16 pages)
Funding: Supported in part by the National Natural Science Foundation of China under Grant Nos. 62122043 and 62192753; in part by the Natural Science Foundation of Shandong Province for Distinguished Young Scholars under Grant No. ZR2022JQ31; and in part by the Innovative Research Groups of the National Natural Science Foundation of China under Grant No. 61821004.
Abstract: In this paper, the authors design a reinforcement learning algorithm to solve the adaptive linear-quadratic stochastic n-player nonzero-sum differential game with completely unknown dynamics. For each player, a critic network is used to estimate the Q-function, and an actor network is used to estimate the control input. A model-free online Q-learning algorithm is obtained for solving this kind of problem. It is proved that, under some mild conditions, the system state and the weight estimation errors are uniformly ultimately bounded. A simulation with five players is given to verify the effectiveness of the algorithm.
Keywords: Actor-critic algorithm; model-free adaptive control; nonzero-sum stochastic game; reinforcement learning
Classification: O225 [Science: Operations Research and Cybernetics]; TP181 [Science: Mathematics]