Data Driven Control Simulation Based on Approximate Q-Learning Algorithm  Cited by: 2


Authors: YU Zi-hang; WANG Gai-yun[1] (School of Electrical Engineering and Automation, Guilin University of Electronic Technology, Guilin, Guangxi 541000, China)

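Affiliation: [1] School of Electronic Engineering and Automation, Huajiang Campus, Guilin University of Electronic Technology, Guilin, Guangxi 541000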

Source: Computer Simulation (《计算机仿真》), 2022, No. 5, pp. 344-347, 379 (5 pages)

Abstract: A data-driven control method based on an approximate Q-learning algorithm is proposed to address the imperfect control performance and large tracking error that arise when data-driven control depends on a mathematical model of the controlled system. So that Q is fully learned, the training value is estimated over the sequence of immediate rewards along the time axis. The deterministic update rule is modified accordingly: the new Q value is computed as a weighted average of the current Q value and the revised attenuated (discounted) estimate, which allows the algorithm to converge. A controller structured on the Q-learning algorithm replaces the general nonlinear model at the current operating point of the controlled system, and only the data provided by the controlled object are used to evaluate the pseudo partial derivative function in the model, giving model-free data-driven control. Simulation results show that the proposed method produces smaller signal disturbance and smaller tracking error, and that its overall performance is better than that of the traditional method.
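The weighted-average update described in the abstract can be illustrated with a minimal tabular sketch. The abstract does not give the paper's exact rule, so this follows the common textbook form, Q(s,a) ← (1 − α)·Q(s,a) + α·[r + γ·max Q(s′,a′)]. The dictionary-based table, the function name `q_update`, and the visit-count learning rate α = 1/(1 + visits) are all assumptions made for illustration, not details taken from the paper.

```python
def q_update(q, visits, state, action, reward, next_state, actions, gamma=0.9):
    """Blend the current Q value with a revised discounted estimate.

    q and visits are plain dicts keyed by (state, action); both are
    hypothetical containers chosen for this sketch.
    """
    key = (state, action)

    # Assumed schedule: the learning rate shrinks as the pair is revisited.
    visits[key] = visits.get(key, 0) + 1
    alpha = 1.0 / (1.0 + visits[key])

    # Revised estimate: immediate reward plus discounted best next value.
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    target = reward + gamma * best_next

    # Weighted average of the current Q value and the revised estimate.
    q[key] = (1.0 - alpha) * q.get(key, 0.0) + alpha * target
    return q[key]


if __name__ == "__main__":
    q_table, visit_counts = {}, {}
    actions = ["left", "right"]
    for _ in range(3):
        value = q_update(q_table, visit_counts, "s0", "left", 1.0, "s1", actions)
        print(round(value, 4))
```

Because α decays with repeated visits, later samples move the stored value less and less, which is the usual reason a weighted average is preferred over the deterministic overwrite when rewards or transitions are noisy.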

Keywords: data-driven control; attenuation value estimation; nonlinear discrete model; pseudo partial derivative function

Classification: TP472 [Automation and Computer Technology]

 
