Optimal Tracking Control of Two-time-scale Systems Based on Reinforcement Learning

Authors: DENG Wudan; LI Qingkui (School of Automation, Beijing Information Science and Technology University)

Source: Instrument Technique and Sensor, 2024, No. 9, pp. 92-98 (7 pages)

Funding: National Key Research and Development Program of China (2020YFB1708200).

Abstract: To address the optimal tracking control problem for two-time-scale systems, a method combining singular perturbation theory with reinforcement learning is proposed. First, singular perturbation theory is used to decompose the system into a fast subsystem and a slow subsystem, which removes the difficulty caused by the singular perturbation parameter. Second, the tracking problem of the two-time-scale system is split into a linear quadratic tracking (LQT) problem for the slow subsystem and a linear quadratic regulator (LQR) problem for the fast subsystem, and policy Q-learning is then used to design a controller-solving algorithm for each of the two subsystems. Simulation results show that the proposed method can achieve optimal tracking performance for the system.
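The decomposition step named in the abstract can be illustrated with a minimal sketch. It assumes a continuous-time linear system in the standard singularly perturbed form and uses the textbook quasi-steady-state reduction; the abstract does not state the system model, so the function name `decompose`, the matrix names, and the numerical values below are all hypothetical and are not taken from the paper.

```python
import numpy as np

def decompose(A11, A12, A21, A22, B1, B2):
    """Slow/fast decomposition (quasi-steady-state) of the assumed model
           x'     = A11 x + A12 z + B1 u
           eps*z' = A21 x + A22 z + B2 u
    A22 must be invertible. Returns ((A_s, B_s), (A_f, B_f))."""
    A22_inv = np.linalg.inv(A22)
    # Slow subsystem: set eps = 0 and eliminate the fast state z.
    A_s = A11 - A12 @ A22_inv @ A21
    B_s = B1 - A12 @ A22_inv @ B2
    # Fast subsystem, written in the stretched time tau = t / eps.
    A_f, B_f = A22, B2
    return (A_s, B_s), (A_f, B_f)

# Hypothetical scalar-block example (values chosen only for illustration).
eps = 0.01
A11 = np.array([[-1.0]]); A12 = np.array([[1.0]])
A21 = np.array([[1.0]]);  A22 = np.array([[-2.0]])
B1 = np.array([[0.0]]);   B2 = np.array([[1.0]])

(A_s, B_s), (A_f, B_f) = decompose(A11, A12, A21, A22, B1, B2)

# Eigenvalues of the full system, for comparison with the two subsystems.
A_full = np.block([[A11, A12], [A21 / eps, A22 / eps]])
eigs = np.sort(np.linalg.eigvals(A_full).real)  # ascending: fast mode first

print("slow subsystem A_s:", A_s, " full-system slow eigenvalue:", eigs[-1])
print("fast subsystem A_f/eps:", A_f / eps, " full-system fast eigenvalue:", eigs[0])
```

With a small `eps`, the slow eigenvalue of the full system stays close to that of `A_s`, and the fast eigenvalue stays close to that of `A_f / eps`. This is why an LQT controller and an LQR controller can be designed separately on the two reduced models without handling `eps` directly.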

Keywords: two-time-scale systems; singular perturbation; Q-learning; optimal tracking control

Classification: TP273 [Automation and Computer Technology: Detection Technology and Automatic Devices]
