Study on Coordinated Control Method of Reactor Power Based on Multi-Agent Reinforcement Learning


Authors: Niu Zhenfeng; Li Tong; Li Jiangkuan; Liu Yongchao; Lyu Wei; Tan Sichao; Tian Ruifeng

Affiliations: [1] Heilongjiang Provincial Key Laboratory of Nuclear Power System and Equipment, Harbin Engineering University, Harbin 150001, China; [2] Key Laboratory of Nuclear Safety and Advanced Nuclear Energy Technology, Ministry of Industry and Information Technology, Harbin Engineering University, Harbin 150001, China

Source: Nuclear Power Engineering (核动力工程), 2025, No. 2, pp. 186-192 (7 pages)

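Funding: National Natural Science Foundation of China (12405200); Fundamental Research Funds for the Central Universities (3072024CFJ1501); Heilongjiang Provincial Undergraduate Universities "Basic Research Support Program for Outstanding Young Teachers" (KY11500240018).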

Abstract: To improve the precision of coordinated control of reactor power and steam generator water level in nuclear power plants, this study proposes a multi-agent reinforcement learning coordinated control framework based on the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm. In this framework, different subtasks are assigned to corresponding agents, which cooperate to accurately coordinate reactor power and steam generator water level. The framework's performance under different operating conditions was evaluated through a series of simulation experiments. The results show that the multi-agent control framework significantly improves control speed and stability under various power-switching conditions, with both overshoot and control time better than those of a traditional proportional-integral-derivative (PID) controller, demonstrating the framework's effectiveness and superiority. In addition, the framework shows excellent generalization to new operating conditions not seen during training and can effectively improve the precision and stability of coordinated reactor power control.
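The abstract names TD3 as the base algorithm but gives no implementation details. As a rough illustration only, and not the authors' code, the NumPy sketch below shows the standard TD3 critic target (clipped double-Q learning with target policy smoothing). The function name `td3_target`, the callable target networks, and all hyperparameter defaults are assumptions made for this example. In a multi-agent arrangement like the one described, each agent (for instance one for reactor power and one for steam generator water level) would presumably compute such a target from its own reward signal.

```python
import numpy as np


def td3_target(reward, done, next_state,
               actor_target, critic1_target, critic2_target, rng,
               gamma=0.99, policy_noise=0.2, noise_clip=0.5, max_action=1.0):
    """Standard TD3 critic target: y = r + gamma * (1 - done) * min(Q1', Q2').

    All argument names and defaults are hypothetical; the target networks
    are passed in as plain callables so the sketch stays framework-free.
    """
    # Target policy smoothing: add clipped Gaussian noise to the target action.
    next_action = actor_target(next_state)
    noise = np.clip(rng.normal(0.0, policy_noise, size=next_action.shape),
                    -noise_clip, noise_clip)
    next_action = np.clip(next_action + noise, -max_action, max_action)

    # Clipped double-Q: use the smaller of the two target critics' estimates
    # to limit overestimation of the action value.
    q_next = np.minimum(critic1_target(next_state, next_action),
                        critic2_target(next_state, next_action))

    # Bootstrapped target; terminal transitions (done = 1) keep only the reward.
    return reward + gamma * (1.0 - done) * q_next


# Toy stand-ins for the target networks (assumptions, not a reactor model).
def toy_actor(state):
    return np.full((state.shape[0], 1), 0.5)


def toy_critic1(state, action):
    return 2.0 * action


def toy_critic2(state, action):
    return 3.0 * action


rng = np.random.default_rng(0)
states = np.zeros((2, 3))
rewards = np.array([[1.0], [1.0]])
dones = np.array([[0.0], [1.0]])
y = td3_target(rewards, dones, states, toy_actor, toy_critic1, toy_critic2,
               rng, gamma=0.9, policy_noise=0.0)
```

Setting `policy_noise=0.0` in the toy call makes the result deterministic, which is convenient for checking the arithmetic; in training the noise would normally be nonzero.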

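Keywords: RELAP5; coordinated control; reactor power control; steam generator water level control; multi-agent reinforcement learning; Twin Delayed Deep Deterministic Policy Gradient (TD3)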

Classification: TL36 [Nuclear Science and Technology: Nuclear Technology and Applications]

 
