Power System Transient Stability Emergency Control Based on Hybrid Distributed Deep Reinforcement Learning


Authors: CHEN Yixi (陈一熙); ZHU Jizhong (朱继忠); LIU Jiayuan (刘嘉媛); HUANG Linying (黄林莹) (School of Electric Power Engineering, South China University of Technology, Guangzhou 510640, Guangdong Province, China)

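Affiliation: [1] School of Electric Power Engineering, South China University of Technology, Guangzhou 510640, Guangdong Province, China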

Source: Power System Technology (《电网技术》), 2025, No. 4, pp. 1513-1523, I0054, I0055 (13 pages in total)

Funding: National Natural Science Foundation of China (52177087); Guangdong Basic and Applied Basic Research Foundation (2022B1515250006).

Abstract: Under the "dual carbon" goal, the large-scale integration of renewable energy into the grid makes power system operating conditions more time-varying and places new requirements on online emergency control strategies. To maintain the transient stability of the power system after a large disturbance, this paper proposes an online emergency control strategy based on hybrid distributed deep reinforcement learning. First, the transient stability emergency control problem is modeled as a Markov decision process. Next, to address the curse of dimensionality and the loss of accuracy that conventional deep reinforcement learning algorithms incur when they discretize a hybrid action space, a discrete-continuous hybrid policy architecture is proposed, with the proximal policy optimization algorithm used for policy updates, so that the hybrid action space of the emergency control problem is handled directly. Then, to address the long training time and insufficient robustness of conventional deep reinforcement learning algorithms, a distributed parallel training architecture is introduced and an invalid action masking mechanism that incorporates prior physical knowledge of emergency control is designed, which significantly improves the training speed and robustness of the algorithm. Finally, tests on the IEEE 39-bus system verify the effectiveness and superiority of the proposed algorithm for transient stability emergency control decisions.
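The combination of a discrete-continuous hybrid policy, invalid action masking, and a proximal policy optimization (PPO) update described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes, purely for illustration, that the discrete part of an action selects one of several candidate control locations and the continuous part is a control amount drawn from a Gaussian tied to that location; the abstract does not specify the actual action definition, network, or masking rules. All function names, parameters, and numbers below are hypothetical, and the distributed parallel training part is not shown.

```python
import math
import random


def masked_softmax(logits, mask):
    """Softmax over logits; entries with mask[i] == False get probability 0."""
    masked = [l if m else -math.inf for l, m in zip(logits, mask)]
    top = max(masked)
    if top == -math.inf:
        raise ValueError("at least one action must be legal")
    exps = [math.exp(v - top) for v in masked]  # exp(-inf) evaluates to 0.0
    total = sum(exps)
    return [e / total for e in exps]


def gaussian_log_prob(x, mean, std):
    """Log-density of a univariate Gaussian at x."""
    return (-0.5 * ((x - mean) / std) ** 2
            - math.log(std) - 0.5 * math.log(2.0 * math.pi))


def sample_hybrid_action(logits, mask, means, stds, rng):
    """Sample (discrete index, continuous amount) and the joint log-probability.

    Assumption: the continuous head has one Gaussian per discrete choice.
    """
    probs = masked_softmax(logits, mask)
    k = rng.choices(range(len(probs)), weights=probs)[0]
    x = rng.gauss(means[k], stds[k])
    # Joint log-probability is the sum of the discrete and continuous parts.
    log_prob = math.log(probs[k]) + gaussian_log_prob(x, means[k], stds[k])
    return k, x, log_prob


def ppo_clip_objective(new_log_prob, old_log_prob, advantage, eps=0.2):
    """Per-sample PPO clipped surrogate objective (to be maximized)."""
    ratio = math.exp(new_log_prob - old_log_prob)
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)


if __name__ == "__main__":
    rng = random.Random(0)
    logits = [1.0, 2.0, 3.0]    # hypothetical scores for three candidate locations
    mask = [True, False, True]  # hypothetical rule: location 1 is not allowed
    means = [0.1, 0.2, 0.3]     # hypothetical mean control amounts
    stds = [0.05, 0.05, 0.05]
    k, x, log_prob = sample_hybrid_action(logits, mask, means, stds, rng)
    print("location:", k, "amount:", round(x, 4), "log_prob:", round(log_prob, 4))
```

Because masked actions receive exactly zero probability, they are never sampled and drop out of the joint log-probability that enters the PPO ratio. That is one common way to encode prior knowledge in a policy, though the mechanism in the paper may differ.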

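Keywords: transient stability; emergency control; deep reinforcement learning; discrete-continuous hybrid policy; distributed parallel training; proximal policy optimization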

Classification number: TM721 [Electrical Engineering: Power Systems and Automation]

 
