Method for Optimal Scheduling of AC/DC Hybrid Distribution Network Based on Double-Agent Deep Reinforcement Learning (cited by: 2)

Authors: Zhao Qianyu; Han Zhaoyang; Wang Shouxiang [1,2]; Yin Ziyang; Dong Yichao; Qian Guangchao

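Affiliations: [1] Key Laboratory of the Ministry of Education on Smart Power Grids (Tianjin University), Tianjin 300072, China; [2] Tianjin Key Laboratory of Power System Simulation and Control (Tianjin University), Tianjin 300072, China; [3] State Grid Tianjin Electric Power Company, Tianjin 300010, China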

Source: Journal of Tianjin University (Science and Technology), 2024, No. 6, pp. 624-632 (9 pages)

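Funding: National Natural Science Foundation of China (U2166202); Science and Technology Project of State Grid Corporation of China Headquarters (5108-202299256A-1-0-ZB).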

Abstract: With the large-scale integration of DC power sources and loads, AC/DC hybrid distribution networks have become the development trend of future distribution networks. However, source-load uncertainty and the diverse types of dispatchable devices pose considerable challenges to distribution network scheduling. This paper proposes a scheduling method for AC/DC distribution networks based on double-agent deep reinforcement learning that combines the branching dueling Q-network (BDQ) and soft actor-critic (SAC). First, the economic scheduling problem is mapped onto the actions, rewards, and states of the two agents to establish a Markov decision process for economic scheduling, and two agents are set up using the BDQ and SAC methods, respectively: the BDQ agent controls the discrete-action devices in the distribution network, and the SAC agent controls the continuous-action devices. Second, the two agents interact with the environment and are trained offline through centralized training and decentralized execution. Finally, the agents' parameters are fixed and online scheduling is performed. The advantage of the method is that the double agent can simultaneously control discrete-action devices (capacitor banks and on-load tap-changing transformers) and continuous-action devices (voltage source converters and energy storage), and the centralized training of the two agents allows them to adapt to source-load uncertainty. Tests on a modified IEEE 33-node AC/DC distribution network verify the effectiveness of the proposed method.
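The division of labor between the two agents can be illustrated with a minimal Python sketch. It is not the authors' implementation: the device names, the number of capacitor steps and tap positions, the numeric values, and the helper functions `select_discrete`, `select_continuous`, and `joint_action` are all hypothetical, and fixed numbers stand in for the outputs of trained networks. The sketch assumes the usual BDQ form (one advantage branch per discrete device, combined with a shared state value) and the usual SAC form (a Gaussian sample squashed by tanh and rescaled to the device limits).

```python
import math
import random

def bdq_branch_q(state_value, advantages):
    """Dueling aggregation for one branch: Q = V + (A - mean(A))."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]

def select_discrete(state_value, branch_advantages):
    """Greedy index per branch; each branch is one discrete device (hypothetical names)."""
    choice = {}
    for device, adv in branch_advantages.items():
        q = bdq_branch_q(state_value, adv)
        choice[device] = max(range(len(q)), key=q.__getitem__)
    return choice

def select_continuous(means, log_stds, bounds, rng, deterministic=False):
    """SAC-style action: Gaussian sample, tanh squash, rescale to [low, high]."""
    action = {}
    for device, mu in means.items():
        noise = 0.0 if deterministic else rng.gauss(0.0, 1.0)
        raw = mu + math.exp(log_stds[device]) * noise
        low, high = bounds[device]
        action[device] = low + 0.5 * (math.tanh(raw) + 1.0) * (high - low)
    return action

def joint_action(discrete, continuous):
    """Merge both agents' outputs into a single dispatch command."""
    return {**discrete, **continuous}

# Placeholder numbers standing in for network outputs (assumed, not from the paper).
rng = random.Random(0)
discrete = select_discrete(
    1.0,
    {"capacitor_bank": [0.1, 0.5, -0.2], "oltc_tap": [0.0, -0.3, 0.4, 0.2, 0.1]},
)
continuous = select_continuous(
    means={"vsc_p": 0.0, "storage_p": 0.3},
    log_stds={"vsc_p": -1.0, "storage_p": -1.0},
    bounds={"vsc_p": (-1.0, 1.0), "storage_p": (-0.5, 0.5)},
    rng=rng,
    deterministic=True,  # mean action, as one might use once parameters are fixed
)
command = joint_action(discrete, continuous)
```

Calling `select_continuous` with `deterministic=False` draws a stochastic action instead, which is the mode one would expect during offline training; the deterministic call mirrors the online stage described in the abstract, where the agents' parameters are held fixed.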

Keywords: AC/DC distribution network; deep reinforcement learning; economic dispatch; branching dueling Q-network; soft actor-critic

Classification: TM734 [Electrical Engineering: Power Systems and Automation]
