An Online Q-Learning Algorithm for a Model-Free Infinite Horizon System

Authors: DAI Xiaoqing; ZHAO Xu

Affiliations: [1] School of Computer Science, Chengdu Normal University, Chengdu 611000, China; [2] School of Computer and Software, Nanjing University of Information Technology, Nanjing 210000, China

Source: Electronics Optics & Control, 2022, No. 2, pp. 53-57 (5 pages)

Funding: Key Research and Development Program of the Science and Technology Department of Sichuan Province (20ZDYF2386).

Abstract: To address the online implementation of infinite-horizon optimal control for continuous-time linear systems, an online Q-learning algorithm is designed under the condition that the system dynamics are completely unknown. The Q-function of the continuous-time linear system is constructed from the Hamiltonian function and the optimal cost function of infinite-horizon optimal control theory. An Actor/Critic approximator structure is designed using the integral reinforcement learning method; it estimates the parameters of the Q-function online while guaranteeing asymptotic stability of the closed-loop system and convergence to the optimal solution. A numerical simulation on a 6th-order linear model of a turbocharged engine shows that both the Critic weights and the Actor weights converge asymptotically to their optimal values, achieving model-free optimal control.
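The construction "Q-function = optimal cost + Hamiltonian" can be illustrated on a scalar system. The sketch below is not the paper's algorithm: it assumes the dynamics dx/dt = a*x + b*u, the cost integral of m*x^2 + r*u^2, and the optimal cost V*(x) = p*x^2, and it uses the model parameters a and b only to compute a reference Q-function. All names and numeric values are hypothetical. In the model-free setting described in the abstract, the entries of Q would instead be estimated online from measured data.

```python
import math


def riccati_scalar(a, b, m, r):
    """Positive root p of the scalar Riccati equation 2*a*p + m - (b*p)**2 / r = 0."""
    return r * (a + math.sqrt(a * a + b * b * m / r)) / (b * b)


def q_matrix(a, b, m, r, p):
    """Entries of the quadratic Q-function built from V*(x) = p*x**2.

    Q(x, u) = V*(x) + H(x, u) = q_xx*x**2 + 2*q_xu*x*u + q_uu*u**2,
    with Hamiltonian H = 2*p*x*(a*x + b*u) + m*x**2 + r*u**2.
    """
    q_xx = p + 2 * a * p + m
    q_xu = p * b
    q_uu = r
    return q_xx, q_xu, q_uu


def q_value(q, x, u):
    """Evaluate the quadratic form Q(x, u)."""
    q_xx, q_xu, q_uu = q
    return q_xx * x * x + 2 * q_xu * x * u + q_uu * u * u


def greedy_gain(q):
    """Gain k of the control u = -k*x that minimises Q over u.

    Only the entries of Q are needed here, not the model parameters a and b.
    """
    _, q_xu, q_uu = q
    return q_xu / q_uu


# Illustrative values (assumptions, not taken from the paper).
a, b, m, r = 1.0, 1.0, 3.0, 1.0
p = riccati_scalar(a, b, m, r)
q = q_matrix(a, b, m, r, p)
k = greedy_gain(q)
```

Two properties make this form useful for learning: the minimising control depends only on the entries of Q, and evaluating Q at that control recovers the optimal cost V*(x). An estimator that converges to the Q entries therefore yields the optimal feedback gain without knowledge of a or b.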

Keywords: optimal control; Hamiltonian function; Q-learning; Actor/Critic approximator

Classification: TP249 [Automation and Computer Technology: Detection Technology and Automation Devices]
