Intelligent generation method of power system stability control strategy based on safe reinforcement learning

Cited by: 1


Authors: QIU Jian; ZHU Yukun; ZHANG Jianxin [1]; ZHU Yihua; XU Guanghu [1]; TU Liang

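Affiliations: [1] China Southern Power Grid Company Limited, Guangzhou 510663, Guangdong, China; [2] State Key Laboratory of HVDC (Electric Power Research Institute, CSG), Guangzhou 510663, Guangdong, China; [3] Guangdong Provincial Key Laboratory of Intelligent Operation and Control for New Energy Power System, Guangzhou 510663, Guangdong, China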

Source: Power System Protection and Control (《电力系统保护与控制》), 2024, No. 10, pp. 147-155 (9 pages)

Funding: Supported by the Key Science and Technology Project of China Southern Power Grid Co., Ltd. (No. 000000KK52210139).

Abstract: The "double-high" trend in new-type power systems, namely high shares of renewable energy and of power electronic equipment, has changed the classical stability characteristics of the power system. Stability mechanisms have become more complex and stability modes more diverse, so online stability control strategies built on typical operating modes face challenges. To address rotor angle stability in new-type power systems, this paper proposes a method for intelligently generating stability control strategies based on safe reinforcement learning. First, a constrained Markov model of the power system stability control problem is established, and the safety constraints on emergency-control generator tripping actions are summarized and formulated. Second, to improve the extraction of spatial and temporal features from the grid's transient response, a feature perception network based on graph convolutional layers and long short-term memory units is constructed. Third, to improve the training efficiency of the stability control agent, a training framework is proposed that uses the proximal policy optimization algorithm with embedded domain-knowledge constraints. Finally, the method is tested on the IEEE 39-bus system and a practical power grid. The results show that the proposed method adaptively generates generator tripping strategies according to the system operating state and fault response, and that its decision-making effectiveness and efficiency are superior to those of existing stability control strategies.

Keywords: stability control strategy; safe reinforcement learning; spatio-temporal features; domain knowledge

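Classification: TM712 [Electrical Engineering: Power Systems and Automation]; TP181 [Automation and Computer Technology: Control Theory and Control Engineering]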
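
Illustrative note: the abstract does not say how the domain-knowledge constraints are embedded in the policy. One common way to keep a discrete policy inside a safe action set is action masking, sketched below for a generator tripping decision. Everything in the sketch is an assumption made for illustration and is not taken from the paper: the function names `feasible_mask` and `masked_softmax`, the two feasibility rules (a unit must be online, and total tripped capacity must stay under a limit), and all numbers.

```python
import math

def feasible_mask(unit_mw, online, tripped_mw, max_trip_mw):
    """Return True for each unit that may still be tripped.

    Hypothetical rules (not from the paper): a unit must be online, and
    tripping it must not push total tripped capacity past max_trip_mw.
    """
    return [on and tripped_mw + mw <= max_trip_mw
            for mw, on in zip(unit_mw, online)]

def masked_softmax(logits, mask):
    """Softmax over logits with infeasible actions forced to probability 0."""
    allowed = [x for x, ok in zip(logits, mask) if ok]
    if not allowed:
        raise ValueError("no feasible action")
    peak = max(allowed)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) if ok else 0.0
            for x, ok in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]

# Toy example: three candidate units; the second one is offline.
unit_mw = [200.0, 300.0, 600.0]
online = [True, False, True]
mask = feasible_mask(unit_mw, online, tripped_mw=100.0, max_trip_mw=1000.0)
probs = masked_softmax([0.5, 2.0, 1.0], mask)
```

In this toy case the offline unit receives zero probability even though its raw logit is the largest, so a policy sampled from `probs` cannot select it. Whether the paper uses masking, a penalty term, or another mechanism cannot be determined from the abstract.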

 
