Design of Self-adaption Nuclear Reactor Power Controller Based on Deep Deterministic Policy Gradient Algorithm (cited by: 3)

Authors: LIU Yongchao; LI Tong; CHENG Yiheng; WANG Bo [1,2]; GAO Puzhen; TAN Sichao; TIAN Ruifeng

Affiliations: [1] Heilongjiang Provincial Key Laboratory of Nuclear Power System and Equipment, Harbin Engineering University, Harbin 150001, Heilongjiang, China; [2] Key Laboratory of Nuclear Safety and Advanced Nuclear Energy Technology (Ministry of Industry and Information Technology), Harbin Engineering University, Harbin 150001, Heilongjiang, China

Source: Atomic Energy Science and Technology (《原子能科学技术》), 2024, No. 5, pp. 1076-1083 (8 pages)

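Funding: China National Nuclear Corporation (CNNC) Lingchuang Project (CNNC-LCKY-202245, CNNC-LCKY-202251); Fundamental Research Funds for the Central Universities (3072022JC2401).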

Abstract: Nuclear power plants rely on a large number of control systems for effective control and safe operation. The reactor core, which houses the radioactive nuclear fuel and acts as the heat source, is a key component, and reactor power control bears directly on the safety and economy of plant operation. Traditional PID controllers struggle to handle nonlinear power control over a wide power range accurately. To address this, the study derives and builds a reactor core model for a pressurized water reactor nuclear power plant, and runs power control simulations with an adaptive controller that combines a policy-gradient-based deep reinforcement learning method with a PID controller. The simulation results show that, compared with a traditional PID controller, the adaptive power controller based on the deep deterministic policy gradient algorithm responds faster, achieves higher control accuracy and stability, and is also relatively robust; it can control core power accurately and quickly and track load changes.

Nuclear power plants need a large number of control systems to achieve effective control and safe operation. The reactor core is the key component that holds the radioactive nuclear fuel and serves as the heat source, and reactor power control is related to the safety and economy of plant operation. Optimizing the design of the nuclear reactor power controller is therefore of great significance. The control parameters of a PID controller are fixed in advance at the controller design stage, which leaves room to improve its control performance. To solve the problem that a traditional PID controller has difficulty handling nonlinear power control over a wide power range accurately, this study derived and established a reactor core model for a pressurized water reactor nuclear power plant. The core model includes a heat transfer equation, neutron dynamics equations and a reactivity equation. An adaptive controller that combines policy-gradient-based deep reinforcement learning (the deep deterministic policy gradient algorithm) with a PID (proportional-integral-derivative) controller was used to simulate power control, and a reward function was constructed. The reward function represents the optimization of several control evaluation indexes, such as response time, threat time, control accuracy, overshoot and oscillation. By interacting with the core model in real time, the deep deterministic policy gradient algorithm learns a policy that optimizes the control parameters of the PID controller in real time. Several groups of operating conditions with different power levels and different power switching modes were tested. The simulation results show that, in the 100% FP to 90% FP step power reduction process (the training condition), the self-adaptive power controller based on the deep deterministic policy gradient algorithm has a faster response and higher control accuracy and stability than the traditional PID controller. At the same time, under

Keywords: power control; reinforcement learning; deep learning; adaptive controller

Classification: TL36 [Nuclear Science and Technology: Nuclear Technology and Applications]
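
The control scheme summarized in the abstract (a core model with neutron dynamics, heat transfer and reactivity feedback, a PID controller whose gains are supplied by a learned policy, and a reward built from control-quality indexes) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the one-delayed-group point-kinetics model, the lumped temperature equation, all numerical constants, the choice of rod reactivity rate as the PID output, the reward terms, and the names `core_step`, `fixed_gains`, `step_reward` and `simulate`. In particular, `fixed_gains` is only a placeholder for the trained DDPG actor, which is not reproduced here.

```python
from dataclasses import dataclass

# Illustrative constants (assumptions, not the paper's plant data).
BETA = 0.0065     # delayed neutron fraction
LAMBDA_D = 0.08   # one-group precursor decay constant, 1/s
GEN_TIME = 1e-4   # neutron generation time, s
ALPHA_T = -2e-5   # temperature reactivity coefficient, 1/K
GAIN_T = 300.0    # steady-state temperature rise per unit relative power, K
TAU_T = 5.0       # thermal time constant, s
RATE_MAX = 1e-3   # limit on rod reactivity insertion rate, 1/s


@dataclass
class CoreState:
    n: float = 1.0        # relative power (1.0 = 100% FP)
    c: float = 1.0        # normalized delayed-neutron precursor concentration
    temp: float = 0.0     # temperature deviation from the 100% FP value, K
    rho_rod: float = 0.0  # control rod reactivity


def core_step(s, rod_rate, dt):
    """Advance the lumped core model by one explicit-Euler step."""
    rho = s.rho_rod + ALPHA_T * s.temp                  # reactivity equation
    dn = ((rho - BETA) * s.n + BETA * s.c) / GEN_TIME   # neutron dynamics
    dc = LAMBDA_D * (s.n - s.c)                         # precursor balance
    dtemp = (GAIN_T * (s.n - 1.0) - s.temp) / TAU_T     # lumped heat transfer
    rate = max(-RATE_MAX, min(RATE_MAX, rod_rate))      # actuator rate limit
    return CoreState(s.n + dt * dn, s.c + dt * dc,
                     s.temp + dt * dtemp, s.rho_rod + dt * rate)


def fixed_gains(error, power):
    """Placeholder for a trained actor: returns constant (kp, ki, kd)."""
    return 6.5e-3, 1.0e-4, 1.0e-3


def step_reward(error, d_power, w_osc=0.1):
    """Hypothetical reward: penalize tracking error and fast power swings."""
    return -(abs(error) + w_osc * abs(d_power))


def simulate(setpoint=0.9, t_end=60.0, dt=1e-3, gain_policy=fixed_gains):
    """Step the power setpoint from 100% FP to `setpoint` under PID control."""
    s = CoreState()
    integral, prev_err = 0.0, setpoint - s.n
    powers, total_reward = [], 0.0
    for _ in range(int(round(t_end / dt))):
        err = setpoint - s.n
        kp, ki, kd = gain_policy(err, s.n)   # gains may change at every step
        integral += err * dt
        rod_rate = kp * err + ki * integral + kd * (err - prev_err) / dt
        prev_err = err
        nxt = core_step(s, rod_rate, dt)
        total_reward += step_reward(setpoint - nxt.n, (nxt.n - s.n) / dt) * dt
        s = nxt
        powers.append(s.n)
    return powers, total_reward
```

Calling `simulate()` runs a 100% FP to 90% FP step reduction with constant gains, which corresponds to the conventional PID baseline in this sketch. An adaptive variant would pass a different `gain_policy`, for example a function wrapping a trained actor network, and use the accumulated reward as the training signal.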
