Continuous Reactive Power Optimization of Distribution Network Using Deep Reinforcement Learning  Cited by: 48


Authors: LI Qi (李琦), QIAO Ying (乔颖)[2], ZHANG Yujing (张宇精)[2] (1. Dispatching Center, State Grid Shaanxi Power Grid, Xi’an 710054, Shaanxi Province, China; 2. Department of Electrical Engineering, Tsinghua University, Haidian District, Beijing 100084, China)

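Affiliations: [1] Dispatching Center, State Grid Shaanxi Power Grid, Xi’an 710054, Shaanxi Province, China; [2] Department of Electrical Engineering, Tsinghua University, Haidian District, Beijing 100084, China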

Source: Power System Technology (《电网技术》), 2020, No. 4, pp. 1473-1480 (8 pages)

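Funding: Science and Technology Project of State Grid Shaanxi Electric Power Company, "Research on sending-end voltage security of UHV AC/DC power grids based on group-control phase-modulation technology for renewable energy inverters".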

Abstract: Output fluctuations of high-penetration distributed photovoltaic (PV) generation can cause large voltage fluctuations, increased network losses, and frequent capacitor switching in distribution networks. At the same time, node monitoring coverage in distribution networks is low and power flow modeling is difficult, so continuous voltage and reactive power optimization within a transformer supply area has to be achieved under these unfavorable conditions. This paper proposes a continuous reactive power optimization method for low-observability distribution networks based on deep reinforcement learning (DRL). The method converts the original problem into a multi-step Markov decision process whose objective is to minimize the sum of network loss and action cost and whose control variables are the switching commands of discrete reactive power regulation devices, and solves it with an actor-critic DRL algorithm. To cope with the lack of a complete power flow model and complete observation data, an Actor network is designed to fit the switching policy and a Critic network to fit the action-value function. Deep neural networks directly fit the mapping from system states to the switching actions of the discrete reactive power regulation devices, and the networks are trained through interaction with the actual distribution network. Compared with traditional methods, the proposed method requires neither power flow modeling nor time-interval division of decisions, and does not depend on day-ahead forecasts of load and distributed generation output. It enables online continuous reactive power optimization across multiple time sections and improves the economy of system operation.
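The formulation in the abstract (a multi-step Markov decision process, a reward equal to the negative sum of network loss and action cost, discrete switching commands, and an actor-critic learner) can be illustrated with a minimal sketch. This is not the authors' implementation. As loudly stated assumptions: the deep Actor and Critic networks are replaced by lookup tables, the Critic fits a state-value function rather than the action-value function, the toy state is fully observed, network loss is approximated by the squared mismatch between a discretized reactive load and a single capacitor bank position, and every name and constant below (`N_TAPS`, `SWITCH_COST`, `step`, `train`, and so on) is hypothetical.

```python
import math
import random

N_TAPS = 4             # capacitor bank positions 0..3 (hypothetical device)
N_LOADS = 4            # discretized reactive-load levels 0..3 (hypothetical)
ACTIONS = (-1, 0, 1)   # lower one step, hold, raise one step
LOSS_COEF = 1.0        # weight of the toy loss term (assumed)
SWITCH_COST = 0.2      # cost per tap change (assumed)
GAMMA = 0.9            # discount factor (assumed)


def clamp(x, lo, hi):
    return max(lo, min(hi, x))


def step(load, tap, action, rng):
    """Apply one switching action; return (next_load, new_tap, reward)."""
    new_tap = clamp(tap + action, 0, N_TAPS - 1)
    loss = LOSS_COEF * (load - new_tap) ** 2   # toy proxy for network loss
    cost = SWITCH_COST * abs(new_tap - tap)    # action (switching) cost
    # The load follows a bounded random walk; no forecast is used.
    next_load = clamp(load + rng.choice((-1, 0, 1)), 0, N_LOADS - 1)
    return next_load, new_tap, -(loss + cost)


def softmax(prefs):
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train(episodes=3000, horizon=24, alpha_actor=0.05, alpha_critic=0.1, seed=0):
    """One-step tabular actor-critic trained purely from interaction."""
    rng = random.Random(seed)
    states = [(l, t) for l in range(N_LOADS) for t in range(N_TAPS)]
    theta = {s: [0.0] * len(ACTIONS) for s in states}  # actor preferences
    value = {s: 0.0 for s in states}                   # critic estimates
    for _ in range(episodes):
        # Random start state so that every (load, tap) pair gets explored.
        load, tap = rng.randrange(N_LOADS), rng.randrange(N_TAPS)
        for _ in range(horizon):
            s = (load, tap)
            probs = softmax(theta[s])
            a_idx = rng.choices(range(len(ACTIONS)), weights=probs)[0]
            load, tap, reward = step(load, tap, ACTIONS[a_idx], rng)
            # TD error: the critic's one-step evaluation of the action taken.
            td_error = reward + GAMMA * value[(load, tap)] - value[s]
            value[s] += alpha_critic * td_error
            # Policy-gradient update for a softmax policy.
            for i in range(len(ACTIONS)):
                grad = (1.0 if i == a_idx else 0.0) - probs[i]
                theta[s][i] += alpha_actor * td_error * grad
    return theta, value


def greedy_action(theta, load, tap):
    """Most-preferred switching action in a given state."""
    prefs = theta[(load, tap)]
    return ACTIONS[prefs.index(max(prefs))]
```

After `theta, value = train()`, a call such as `greedy_action(theta, 3, 0)` returns the tap change the learned policy prefers when the load is high and the bank is fully switched out. The learner never sees a power flow model or a load forecast; it only observes states and rewards, which is the property the abstract emphasizes.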

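Keywords: low-observability distribution network; reactive power optimization; reinforcement learning; distributed photovoltaics; data-driven; neural network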

Classification: TM721 [Electrical Engineering: Power Systems and Automation]

 
