Optimal Scheduling of Regional Integrated Energy System Based on Advantage Learning Soft Actor-critic Algorithm and Transfer Learning (cited by: 6)

Authors: LUO Wenjian; ZHANG Jing [1]; HE Yu [1]; GU Tingyun; NIE Xianglun; FAN Luqin; YUAN Xufeng [1]; LI Bowen (College of Electrical Engineering, Guizhou University, Guiyang 550025, Guizhou Province, China; Electric Power Research Institute of Guizhou Power Grid Co., Ltd., Guiyang 550002, Guizhou Province, China)

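Affiliations: [1] College of Electrical Engineering, Guizhou University, Guiyang 550025, Guizhou Province, China; [2] Electric Power Research Institute of Guizhou Power Grid Co., Ltd., Guiyang 550002, Guizhou Province, China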

Source: Power System Technology (《电网技术》), 2023, No. 4, pp. 1601-1611, 11 pages in total

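Funding: National Natural Science Foundation of China (51867005); Guizhou Provincial Science and Technology Program grants 黔科合支撑[2022]一般013 and 黔科合平台人才-GCC[2022]016-1.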

Abstract: To raise the consumption rate of clean energy, reduce the environmental impact of carbon emissions, and obtain optimal scheduling of regional integrated energy systems that generalizes better and is more robust and efficient, this paper proposes an optimal scheduling method based on the advantage learning soft actor-critic (ALSAC) algorithm and transfer learning. The agent interacts with the environment through environmental information, and the regional integrated energy system is scheduled with low-carbon and economic objectives. The paper analyzes the maximum entropy mechanism that improves the robustness of soft actor-critic (SAC) and compares SAC against several policy-gradient-based deep reinforcement learning algorithms and against heuristic algorithms. The idea of advantage learning is then introduced into the update of the SAC Q-value function, which addresses the algorithm's overestimation of Q values and improves its performance. To improve the agent's learning efficiency and its ability to generalize to new scenarios, parameter transfer from transfer learning is added. Case studies show that the scheduling strategy based on the ALSAC algorithm and transfer learning has good robustness, generalization ability, and learning efficiency, and achieves flexible and efficient scheduling of regional integrated energy systems.
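The abstract does not give the paper's update equations, so the following is only a minimal sketch of how an advantage-learning correction is commonly attached to a soft Bellman target. It assumes the generic form target = r + gamma * (1 - done) * V(s') - beta * (V(s) - Q(s, a)), with the soft value estimated from a sampled action as V(s) = Q(s, a) - alpha * log pi(a|s). The function names, the coefficient `beta`, and the default values are hypothetical and are not taken from the paper.

```python
def soft_value(q_value, log_prob, alpha):
    """Single-sample soft value estimate: V(s) = Q(s, a) - alpha * log pi(a|s).

    alpha is the entropy temperature of the maximum entropy objective.
    """
    return q_value - alpha * log_prob


def al_soft_q_target(reward, done, q_sa, v_s, v_next, gamma=0.99, beta=0.1):
    """Soft Bellman target with an advantage-learning correction (assumed form).

    reward : immediate reward r
    done   : 1.0 if the episode ended at this step, else 0.0
    q_sa   : current estimate Q(s, a) for the action actually taken
    v_s    : soft value estimate V(s) of the current state
    v_next : soft value estimate V(s') of the next state
    beta   : hypothetical weight of the advantage-learning term
    """
    # Standard soft Bellman backup; the bootstrap term vanishes at episode end.
    bellman = reward + gamma * (1.0 - done) * v_next
    # Gap between the state value and the value of the taken action.
    action_gap = v_s - q_sa
    # Lower the target in proportion to that gap.
    return bellman - beta * action_gap


if __name__ == "__main__":
    v_s = soft_value(q_value=2.0, log_prob=-1.0, alpha=0.2)
    target = al_soft_q_target(reward=1.0, done=0.0, q_sa=2.0,
                              v_s=v_s, v_next=3.0, gamma=0.9, beta=0.5)
    print(v_s, target)
```

With `beta = 0` the function reduces to the ordinary soft Bellman target, so the correction can be switched off to compare both variants on the same transitions.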

Keywords: regional integrated energy system; deep reinforcement learning; soft actor-critic; transfer learning; advantage learning

Classification: TM721 [Electrical Engineering: Power Systems and Automation]
