Authors: 罗文健 (LUO Wenjian), 张靖 (ZHANG Jing)[1], 何宇 (HE Yu)[1], 古庭赟 (GU Tingyun), 聂祥论 (NIE Xianglun), 范璐钦 (FAN Luqin), 袁旭峰 (YUAN Xufeng)[1], 李博文 (LI Bowen)
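Affiliations: [1] College of Electrical Engineering, Guizhou University, Guiyang 550025, Guizhou Province, China; [2] Electric Power Research Institute of Guizhou Power Grid Co., Ltd., Guiyang 550002, Guizhou Province, China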
Source: Power System Technology (《电网技术》), 2023, No. 4, pp. 1601-1611 (11 pages)
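Funding: National Natural Science Foundation of China (51867005); Guizhou Provincial Science and Technology Program grants 黔科合支撑[2022]一般013 and 黔科合平台人才-GCC[2022]016-1.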
Abstract: To raise the consumption rate of clean energy, reduce the environmental impact of carbon emissions, and achieve more generalizable, robust, and efficient optimal scheduling of regional integrated energy systems, this paper proposes a scheduling method based on the advantage learning soft actor-critic (ALSAC) algorithm and transfer learning. The agent interacts with environmental information to schedule the regional integrated energy system with low-carbon and economic objectives. The paper analyzes the maximum entropy mechanism that improves the robustness of soft actor-critic (SAC) and compares its performance with several policy-gradient-based deep reinforcement learning algorithms and with heuristic algorithms. It then introduces the idea of advantage learning into the update of the SAC Q-value function, which resolves the algorithm's overestimation of Q values and improves its performance. To improve the agent's learning efficiency and its ability to generalize to new scenarios, parameter transfer from transfer learning is added. Case studies show that the scheduling strategy based on ALSAC and transfer learning has good robustness, generalization ability, and learning efficiency, and achieves flexible and efficient scheduling of regional integrated energy systems.
Keywords: regional integrated energy system; deep reinforcement learning; soft actor-critic; transfer learning; advantage learning
Classification: TM721 [Electrical Engineering: Power Systems and Automation]