Authors: 周东阳 (ZHOU Dongyang); 曹军 (CAO Jun) [2]; 毕胜山 (BI Shengshan) [1]; 邵壮 (SHAO Zhuang) [3]; 司风琪 (SI Fengqi) [3]
Affiliations: [1] MOE Key Laboratory of Thermo-Fluid Science and Engineering, Xi'an Jiaotong University, Xi'an 710049, China; [2] Xi'an Thermal Engineering Research Institute Co., Ltd., Xi'an 710054, China; [3] Key Laboratory of Energy Thermal Conversion and Control, Ministry of Education, Southeast University, Nanjing 210096, China
Source: Journal of Xi'an Jiaotong University (《西安交通大学学报》), 2022, No. 8, pp. 32-42 (11 pages)
Funding: National Natural Science Foundation of China (51776171); Key Research and Development Program of Shaanxi Province (2020KW-021).
Abstract: Thermal power units now operate under frequently fluctuating conditions, which makes their complex dynamic processes difficult to identify and leaves controller setpoints hard to determine. To address these problems, this paper proposes a performance-optimal control framework based on historical operating data and reinforcement learning. A small amount of random noise is superimposed on the output of the existing controller; a data buffer covering typical operating conditions is built and maintained with a homogenization grid algorithm; and the performance-optimal control policy function is solved offline with a continuous batch Q-learning algorithm based on particle swarm optimization. Taking the control task of a high-pressure feedwater heater as the object of study, the paper obtains a method for solving a controller that maintains control quality and heat transfer performance under variable operating conditions without system identification or setpoint determination. To verify the generality of the framework, the water level control process is analyzed with a simulation model of the high-pressure heater of a 600 MW unit. The results show that the reinforcement-learning-based framework does not require a system model and can solve, directly from historical operating data, a control policy function whose objective is optimal cumulative performance. The resulting controller achieves good control quality during dynamic processes and keeps the system in a near-optimal performance state at steady state, which is equivalent to achieving setpoint optimization and setpoint tracking control at the same time.
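The data-collection steps named in the abstract (noise added to the existing controller's output, and a grid-based buffer that keeps typical operating conditions evenly represented) can be illustrated with a minimal sketch. The abstract does not give the details of the homogenization grid algorithm, so everything below is an assumption made for illustration: the names `GridBuffer` and `explore`, the uniform noise distribution, the fixed cell size, and the rule of dropping the oldest sample in a full cell are all hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


class GridBuffer:
    """Hypothetical grid buffer: caps the number of samples stored per grid cell,
    so densely visited operating conditions do not dominate the data set."""

    def __init__(self, cell_size, max_per_cell):
        self.cell_size = cell_size
        self.max_per_cell = max_per_cell
        self.cells = defaultdict(list)

    def _key(self, point):
        # Map a continuous (state..., action) point to an integer grid cell.
        return tuple(int(x // self.cell_size) for x in point)

    def add(self, state, action, reward, next_state):
        bucket = self.cells[self._key(tuple(state) + (action,))]
        bucket.append((state, action, reward, next_state))
        if len(bucket) > self.max_per_cell:
            bucket.pop(0)  # assumed policy: discard the oldest sample in a full cell

    def __len__(self):
        return sum(len(bucket) for bucket in self.cells.values())


def explore(controller_output, noise_scale, rng):
    """Superimpose a small random perturbation on the existing controller's output.
    Uniform noise is an assumption; the abstract only says 'random noise'."""
    return controller_output + rng.uniform(-noise_scale, noise_scale)


if __name__ == "__main__":
    rng = random.Random(0)
    buf = GridBuffer(cell_size=1.0, max_per_cell=2)
    for step in range(5):
        action = explore(0.5, 0.05, rng)  # perturbed valve command (made-up value)
        buf.add((0.1, 0.2), action, 0.0, (0.1, 0.2))
    print(len(buf))
```

A batch Q-learning stage would then be trained offline on the transitions stored in such a buffer; that stage and its particle swarm optimization step are not sketched here because the abstract does not specify them.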
Classification number: TK26 [Power Engineering and Engineering Thermophysics: Power Machinery and Engineering]