Multi-time-scale Online Optimization for Reactive Power of Distribution Network Based on Deep Reinforcement Learning  Cited by: 46

Authors: NI Shuang (倪爽) [1]; CUI Chenggang (崔承刚) [1]; YANG Ning (杨宁) [1]; CHEN Hui (陈辉) [1]; XI Peifeng (奚培锋) [2]; LI Zhenkun (李振坤) [3]

Affiliations: [1] College of Automation Engineering, Shanghai University of Electrical Power, Shanghai 200090, China; [2] Shanghai Key Laboratory of Smart Grid Demand Response, Shanghai 200063, China; [3] College of Electrical Engineering, Shanghai University of Electrical Power, Shanghai 200090, China

Source: Automation of Electric Power Systems (《电力系统自动化》), 2021, No. 10, pp. 77-85 (9 pages)

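Funding: Supported by the Young Scientists Fund of the National Natural Science Foundation of China (51607111); the Shanghai Science and Technology Innovation Action Plan (18DZ1203502, 19DZ1205700); and the Outstanding Talent Fund project "Design and Development of an Automatic Demand Response Interface" of the Shanghai People's Insurance Bureau (上海市人民保险局) (2017116).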

Abstract: Distribution networks with distributed generators suffer from inaccurate power flow modeling, poor communication conditions, and difficulty in coordinating the various reactive power compensation devices, all of which make online reactive power optimization challenging. This paper uses deep reinforcement learning (DRL) to propose a multi-time-scale online reactive power optimization scheme for distribution networks. The scheme formulates the online reactive power optimization problem as a Markov decision process (MDP). Because reactive power compensation devices differ in regulation speed, two time scales are designed to optimize the settings of discretely adjustable devices and continuously adjustable devices, respectively. The scheme tracks the state of the distribution network in real time, makes online decisions on the settings of the reactive power regulation devices, and does not rely on an accurate power flow model, so it suits partially observable distribution networks that are complex, changeable, and have poor communication conditions. Finally, case studies verify the effectiveness and robustness of the proposed method.

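Keywords: distribution network; deep reinforcement learning; Markov decision process; network loss; multi-time-scale reactive power optimization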

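Classification: TM714.3 [Electrical Engineering: Power Systems and Automation]; TP18 [Automation and Computer Technology: Control Theory and Control Engineering]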

 
