Security-constrained Economic Dispatch of Renewable Energy Integrated Power Systems Based on Proximal Policy Optimization Algorithm

Cited by: 16


Authors: YANG Zhixue; REN Zhouyang [1]; SUN Zhiyuan; LIU Mosi; JIANG Jing [3]; YIN Yue [4]

Affiliations: [1] State Key Laboratory of Power Transmission Equipment & System Security and New Technology (Chongqing University), Shapingba District, Chongqing 400044, China; [2] Electric Power Research Institute of Guangxi Power Grid Co., Ltd., Nanning 530000, Guangxi Zhuang Autonomous Region, China; [3] Department of Electrical and Computer Engineering, University of Western Ontario, London, Ontario N6A 5B9, Canada; [4] College of Electrical Engineering, Sichuan University, Chengdu 610065, Sichuan Province, China

Source: Power System Technology, 2023, No. 3, pp. 988-997 (10 pages)

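Funding: National Natural Science Foundation of China (52277080); International/Hong Kong, Macao and Taiwan Science and Technology Innovation Cooperation Project of the Sichuan Provincial Department of Science and Technology (2022YFH0018).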

Abstract: To efficiently solve the security-constrained economic dispatch problem in power systems with a high proportion of renewable energy, a security-constrained economic dispatch method based on the proximal policy optimization algorithm is proposed. First, a security-constrained economic dispatch model of the renewable energy integrated power system is established based on AC power flow, and the Markov reward process of this model is defined within the deep reinforcement learning framework. A reward function mechanism for the proximal policy optimization algorithm is then designed to guide the agent to efficiently generate dispatch plans that satisfy both the AC power flow constraints and the N-1 security constraints. Next, a mechanism for integrating the dispatch model with the proximal policy optimization algorithm is designed, comprising a method for generating and extracting dispatch training samples and a training mechanism for the value network and the policy network. Finally, the effectiveness and adaptability of the proposed method are validated on the standard IEEE 30-node and IEEE 118-node test systems.
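The abstract names two ingredients without giving formulas: a reward that steers the agent toward feasible dispatch plans, and proximal policy optimization as the learning rule. The following is a minimal sketch of both, not the authors' formulation. `clipped_surrogate` is the standard per-sample clipped objective of PPO; `dispatch_reward`, its `penalty` weight, and the representation of constraint violations as a list of signed margins are hypothetical choices made only for illustration.

```python
import math


def clipped_surrogate(new_logp, old_logp, advantage, eps=0.2):
    """Standard PPO clipped surrogate for one sample (to be maximized).

    new_logp / old_logp are log-probabilities of the taken action under
    the current and the data-collecting policy; eps is the clip range.
    """
    ratio = math.exp(new_logp - old_logp)
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    # Taking the minimum removes any incentive to push the ratio
    # outside [1 - eps, 1 + eps] in the direction that raises the objective.
    return min(ratio * advantage, clipped * advantage)


def dispatch_reward(gen_cost, violations, penalty=100.0):
    """Hypothetical penalty-style reward (an assumption, not from the paper).

    gen_cost:   generation cost of the dispatch plan.
    violations: signed constraint margins, e.g. from power flow and N-1
                checks; positive values mean a limit is exceeded.
    """
    total_violation = sum(max(0.0, v) for v in violations)
    return -gen_cost - penalty * total_violation


if __name__ == "__main__":
    # Identical policies give a ratio of 1, so the objective is the advantage.
    print(clipped_surrogate(0.0, 0.0, 2.0))
    # Only the positive margin is penalized.
    print(dispatch_reward(10.0, [0.0, -0.5, 0.2]))
```

In this sketch a plan that meets every constraint is rewarded by its negative cost alone, so the penalty weight controls how strongly feasibility is prioritized over economy.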

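Keywords: security-constrained economic dispatch; deep reinforcement learning; proximal policy optimization algorithm; renewable energy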

Classification number: TM721 [Electrical Engineering: Power Systems and Automation]

 
