Authors: CHEN Zhun (陈准); PAN Yi (潘毅); FAN Shixiong (范士雄); XU Dan (许丹); DING Qiang (丁强); CAI Zhi (蔡帜)
Affiliation: [1] Beijing Key Laboratory of Research and System Evaluation of Power Dispatching Automation Technology (China Electric Power Research Institute Co., Ltd.), Haidian District, Beijing 100192, China
Source: Electric Power Information and Communication Technology (《电力信息与通信技术》), 2023, No. 3, pp. 33-40 (8 pages)
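Funding: Supported by the State Grid Corporation of China headquarters science and technology project "Research on Integrated Balance Analysis and Decision-Making Technology for Primary and Secondary Energy Toward Carbon Peaking and Carbon Neutrality Goals" (5100-202155294A-0-0-00).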
Abstract: To address the large-scale unit commitment optimization problem in power grids, this paper proposes a deep reinforcement learning method that combines a pointer network with reinforcement learning. First, a reinforcement learning environment for unit commitment is built with minimum generation cost as the objective function, taking full account of the constraints of the power system and of thermal power units. Second, for the optimization itself, a deep reinforcement learning method combining a pointer network with an Actor-Critic model is proposed; it forms a fast mapping from forecast data to unit on/off schedules and thereby solves the unit commitment problem quickly. Case studies on 10-unit and 200-unit systems over 24 time periods show that, compared with results from traditional mathematical programming methods, the proposed method obtains unit commitment results faster. Using a pointer network as the policy network of the reinforcement learning model strengthens the network's feature extraction ability and improves the accuracy of the results.
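Keywords: unit commitment; deep reinforcement learning; pointer network; Actor-Critic model
Classification code: TN915.853 [Electronics and Telecommunications: Communication and Information Systems]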
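Note: the abstract names a pointer network as the policy network but gives no architectural details. As a rough illustration only, the sketch below shows the standard additive pointer-attention step, which scores each input position with u_j = v · tanh(W1 e_j + W2 d) and applies a softmax over the positions that are still allowed. It is not the authors' implementation. The function name `pointer_attention`, the two per-unit features, the query contents, and the random untrained weights are all assumptions made for the example.

```python
import math
import random


def pointer_attention(encoder_states, query, W1, W2, v, mask):
    """Return one probability per input position (here: per generating unit).

    Scores use the additive form u_j = v . tanh(W1 e_j + W2 d).
    Positions with mask[j] == False receive probability 0.
    """
    if not any(mask):
        raise ValueError("at least one position must be allowed")

    def matvec(W, x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in W]

    Wd = matvec(W2, query)  # the query projection is shared by all positions
    scores = []
    for e, allowed in zip(encoder_states, mask):
        if not allowed:
            scores.append(-math.inf)
            continue
        We = matvec(W1, e)
        scores.append(sum(vi * math.tanh(a + b) for vi, a, b in zip(v, We, Wd)))

    # Numerically stable softmax over the allowed positions.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


# Hypothetical data: 4 units, each described by 2 normalized features
# (assumed here to be capacity and marginal cost).
random.seed(0)
hidden = 3
units = [[0.9, 0.2], [0.5, 0.4], [0.3, 0.7], [0.1, 0.9]]
query = [0.6, 0.1]  # assumed decoder state, e.g. forecast load features
W1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(hidden)]
W2 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(hidden)]
v = [random.uniform(-1, 1) for _ in range(hidden)]
mask = [True, True, False, True]  # unit 2 assumed unavailable (e.g. a constraint)

probs = pointer_attention(units, query, W1, W2, v, mask)
```

In an Actor-Critic setup of the kind the abstract describes, a distribution like `probs` would serve as the actor's action probabilities, while a separate critic estimates the value used to reduce the variance of the policy gradient. Masking is one common way to keep infeasible choices out of the sampled action; whether the paper enforces constraints this way is not stated in the abstract.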