Water level control of a steam generator based on deep reinforcement learning  (Cited by: 3)

Authors: ZHANG Jiyu, XIA Hong[1,2], PENG Binsen, WANG Zhichao, JIANG Yingying[1,2]

Affiliations: [1] Key Laboratory of Nuclear Safety and Advanced Nuclear Energy Technology, Ministry of Industry and Information Technology, Harbin Engineering University, Harbin 150001, China; [2] Fundamental Science on Nuclear Safety and Simulation Technology Laboratory, Harbin Engineering University, Harbin 150001, China

Source: Journal of Harbin Engineering University (《哈尔滨工程大学学报》), 2021, No. 12, pp. 1754-1761 (8 pages)

Funding: Lingchuang Fund Project of 核总研究院 (G4821020).

Abstract: To address the difficulty of accurately modeling steam generators and the poor control performance at low power levels, this paper proposes an intelligent hierarchical (IH) controller optimized by deep reinforcement learning. A cascade proportional-integral (PI) controller serves as the primary controller and directly regulates the water level. The advanced controller is an agent trained by deep reinforcement learning that tunes the cascade PI parameters in real time to improve control performance. During training of the advanced-controller agent, state information and a reward function are constructed, and deep residual neural networks are used as approximators of the Q function and the policy function, which yields good generalization. Simulation results show that, at different power levels, the IH method provides both good setpoint tracking and good disturbance rejection for steam generator water level control, which verifies the effectiveness of the controller.
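The two-layer structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the steam-flow feedforward in the outer loop, the gain values, the `placeholder_policy` function (a hand-written stand-in for the trained DDPG actor network) and the quadratic `reward` are all assumptions made for illustration, since the abstract does not specify the state vector, reward function, or network details.

```python
from dataclasses import dataclass


@dataclass
class PI:
    """Discrete-time PI controller with adjustable gains."""
    kp: float
    ki: float
    integral: float = 0.0

    def step(self, error: float, dt: float) -> float:
        self.integral += error * dt
        return self.kp * error + self.ki * self.integral


class CascadePI:
    """Primary controller: outer level loop feeding an inner feedwater-flow loop."""

    def __init__(self, outer: PI, inner: PI):
        self.outer = outer
        self.inner = inner

    def set_gains(self, kp_o: float, ki_o: float, kp_i: float, ki_i: float) -> None:
        # Called by the advanced controller to retune the loops online.
        self.outer.kp, self.outer.ki = kp_o, ki_o
        self.inner.kp, self.inner.ki = kp_i, ki_i

    def step(self, level_sp, level, steam_flow, feed_flow, dt):
        # Assumption: steam flow is added as a feedforward term (three-element style).
        flow_sp = steam_flow + self.outer.step(level_sp - level, dt)
        return self.inner.step(flow_sp - feed_flow, dt)


def placeholder_policy(state):
    """Hypothetical stand-in for a trained actor: maps state to four PI gains."""
    _level_error, power = state
    scale = 0.5 + 0.5 * power  # assumed schedule: smaller outer gains at low power
    return (2.0 * scale, 0.1 * scale, 1.0, 0.05)


def reward(level_error: float, valve_change: float) -> float:
    """Assumed reward: penalize level error and aggressive valve movement."""
    return -(level_error ** 2) - 0.01 * valve_change ** 2


def control_step(ctrl: CascadePI, level_sp, level, steam_flow, feed_flow, power, dt):
    """One cycle: the advanced layer retunes the gains, then the cascade PI acts."""
    gains = placeholder_policy((level_sp - level, power))
    ctrl.set_gains(*gains)
    return ctrl.step(level_sp, level, steam_flow, feed_flow, dt)
```

In a training setup, `placeholder_policy` would be replaced by the actor network and `reward` would supply the learning signal to the critic; here the two functions only show where those pieces plug into the control loop.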

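Keywords: steam generator; deep reinforcement learning; deep deterministic policy gradient (DDPG); water level control; state information; reward function; critic network; actor network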

Classification: TP273 [Automation and Computer Technology: Detection Technology and Automatic Devices]

 
