Authors: Mingming Ha, Ding Wang, Derong Liu
Affiliations: [1] School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing 100083, China; [2] Faculty of Information Technology, the Beijing Key Laboratory of Computational Intelligence and Intelligent System, the Beijing Laboratory of Smart Environmental Protection, and the Beijing Institute of Artificial Intelligence, Beijing University of Technology, Beijing 100124, China; [3] Department of Electrical and Computer Engineering, University of Illinois at Chicago, Chicago, IL 60607, USA; [4] IEEE
Source: IEEE/CAA Journal of Automatica Sinica (English edition of Acta Automatica Sinica, 自动化学报), 2022, Issue 7, pp. 1262-1272 (11 pages)
Funding: This work was supported in part by the Beijing Natural Science Foundation (JQ19013), the National Key Research and Development Program of China (2021ZD0112302), and the National Natural Science Foundation of China (61773373).
Abstract: The core task of tracking control is to make the controlled plant track a desired trajectory. The traditional performance index used in previous studies cannot completely eliminate the tracking error as the number of time steps increases. In this paper, a new cost function is introduced to develop the value-iteration-based adaptive critic framework to solve the tracking control problem. Unlike the regulator problem, the iterative value function of the tracking control problem cannot be regarded as a Lyapunov function. A novel stability analysis method is developed to guarantee that the tracking error converges to zero. The discounted iterative scheme under the new cost function for the special case of linear systems is elaborated. Finally, the tracking performance of the present scheme is demonstrated by numerical results and compared with those of the traditional approaches.
Keywords: Adaptive critic design; adaptive dynamic programming (ADP); approximate dynamic programming; discrete-time nonlinear systems; reinforcement learning; stability analysis; tracking control; value iteration (VI)
Classification: TP273 [Automation and Computer Technology: Detection Technology and Automatic Devices]