Authors: LI Xiaoxu; TIAN Meng[1]; ZHU Ziyang; DONG Zhengcheng; GONG Li; WANG Xianpei[1]
Affiliations: [1] Electronic Information School, Wuhan University, Wuhan 430072, China; [2] School of Automation, Wuhan University of Technology, Wuhan 430072, China
Source: High Voltage Engineering (《高电压技术》), 2023, No. 6, pp. 2329-2338 (10 pages)
Funding: National Natural Science Foundation of China (52177109); Key Research and Development Program of Hubei Province (2020BAB109).
Abstract: Conventional deep reinforcement learning is vulnerable to disturbances such as sensor observation errors when optimizing distribution network power flow, so its robustness is poor. To address this, a distribution network power flow optimization method based on robust reinforcement learning is proposed. First, a power flow optimization model covering distributed generation, energy storage, and load units is established, with minimization of network losses as the objective and voltage and power flow limit violations as safety constraints. Next, the disturbance is modeled as an attack agent that perturbs the observed state of the main power flow optimization agent, yielding a two-agent zero-sum game robust reinforcement learning model. Finally, a dual-agent Lagrange-multiplier trust region policy optimization algorithm is proposed, in which the main agent and the attack agent are trained synchronously, learn asynchronously, and compete against each other. Simulation results show that an agent trained with this method makes safe decisions under different types of disturbance, improving the robustness and safety of distribution network power flow optimization.
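The abstract names three ingredients: a bounded perturbation of the main agent's observation, a zero-sum reward split between the main agent and the attack agent, and a Lagrange multiplier that prices constraint violations. The record does not give the paper's equations, so the Python sketch below is only an illustration under stated assumptions. The function names, the L-infinity perturbation bound `eps`, the scalar violation signal, and the projected dual-ascent step are all hypothetical choices, and the trust-region policy update itself is not reproduced.

```python
import numpy as np


def perturb_observation(obs, delta, eps):
    """Attack agent adds a perturbation clipped to [-eps, eps] per element.

    Assumption: the attack budget is an L-infinity ball of radius eps.
    """
    return obs + np.clip(delta, -eps, eps)


def zero_sum_rewards(network_loss, constraint_violation, lam):
    """Return (main_reward, attacker_reward) for one step.

    Assumption: the main agent is rewarded for low network loss and is
    penalized by lam times the voltage/flow limit violation; the attack
    agent receives the exact negation, which makes the game zero-sum.
    """
    main_reward = -network_loss - lam * constraint_violation
    return main_reward, -main_reward


def update_multiplier(lam, mean_violation, limit, lr):
    """Projected dual ascent on the Lagrange multiplier.

    lam grows when the average violation exceeds the allowed limit and
    shrinks otherwise, but never drops below zero.
    """
    return max(0.0, lam + lr * (mean_violation - limit))
```

In a training loop of this shape, the main agent would act on `perturb_observation(obs, delta, eps)` rather than on `obs`, both agents would be updated from `zero_sum_rewards`, and `update_multiplier` would be called once per batch.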
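Keywords: distribution network power flow optimization; robust reinforcement learning; zero-sum game; state perturbation; safe decision-making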
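Classification: TM744 [Electrical Engineering: Power Systems and Automation]; TP181 [Automation and Computer Technology: Control Theory and Control Engineering]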