Authors: LI Wen-wu (李文武)[1,2], LIU Jiang-peng (刘江鹏), JIANG Zhi-qiang (蒋志强)[3], PEI Ben-lin (裴本林), LI Huang-qiang (李黄强)
Affiliations: [1] College of Electrical Engineering & New Energy, China Three Gorges University, Yichang 443002, Hubei, China; [2] Hubei Key Laboratory of Cascade Hydropower Station Operation & Control, China Three Gorges University, Yichang 443002, Hubei, China; [3] School of Hydropower and Information Engineering, Huazhong University of Science and Technology, Wuhan 430074, Hubei, China; [4] Yichang Yineng Hydropower Co., Ltd., Yichang 443000, Hubei, China; [5] Yichang Power Supply Company, Hubei Electric Power Company, Yichang 443000, Hubei, China
Source: Water Resources and Power (《水电能源科学》), 2020, No. 12, pp. 53-57 (5 pages)
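Funding: National Natural Science Foundation of China (51809098); Open Fund of the Hubei Key Laboratory of Cascade Hydropower Station Operation and Control (China Three Gorges University) (2019KJX08).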
Abstract: The SARSA reinforcement learning algorithm shows limited optimization performance and slow convergence when applied to stochastic reservoir optimization. To address this, an HSARSA(λ) algorithm based on reinforcement learning is proposed. First, an eligibility trace is introduced into SARSA to obtain the SARSA(λ) algorithm; a heuristic function is then added to obtain HSARSA(λ). Finally, the learning rate α, discount factor γ, decay factor λ, and other parameters of HSARSA(λ) are tuned repeatedly to solve the long-term stochastic optimal scheduling problem of a reservoir. A case study shows that, compared with SARSA and SARSA(λ), HSARSA(λ) improves solution accuracy and reduces the number of iterations needed to find a near-optimal solution, offering a new approach to stochastic optimal reservoir scheduling.
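The abstract gives only the outline of the method, so the following is a minimal sketch of how the three ingredients it names (the SARSA update, an eligibility trace, and a heuristic function) might fit together in tabular form. Several things here are assumptions rather than details from the paper: the heuristic is assumed to enter only through action selection, added to the Q-value with a hypothetical weight `xi`; exploration is assumed to be ε-greedy; the trace is an accumulating trace; and the five-state chain environment, `chain_step`, and `prefer_right` are toy stand-ins, not a reservoir model.

```python
import random

def hsarsa_lambda(n_states, n_actions, step, heuristic, episodes=200,
                  alpha=0.1, gamma=0.95, lam=0.8, epsilon=0.1, xi=1.0,
                  max_steps=100, seed=0):
    """Tabular SARSA(lambda) with a heuristic bias in action selection.

    alpha, gamma, lam play the roles of the learning rate, discount factor,
    and decay factor named in the abstract. xi and epsilon are assumptions.
    """
    rng = random.Random(seed)
    Q = [[0.0] * n_actions for _ in range(n_states)]

    def choose(s):
        # Explore with probability epsilon, otherwise act greedily on
        # Q plus the weighted heuristic (assumed form of the guidance).
        if rng.random() < epsilon:
            return rng.randrange(n_actions)
        scores = [Q[s][a] + xi * heuristic(s, a) for a in range(n_actions)]
        return max(range(n_actions), key=scores.__getitem__)

    for _ in range(episodes):
        # Eligibility traces are reset at the start of every episode.
        E = [[0.0] * n_actions for _ in range(n_states)]
        s = 0
        a = choose(s)
        for _ in range(max_steps):
            s2, r, done = step(s, a)
            a2 = choose(s2)
            # On-policy SARSA temporal-difference error.
            target = r if done else r + gamma * Q[s2][a2]
            delta = target - Q[s][a]
            E[s][a] += 1.0  # accumulating trace for the visited pair
            for i in range(n_states):
                for j in range(n_actions):
                    Q[i][j] += alpha * delta * E[i][j]
                    E[i][j] *= gamma * lam
            if done:
                break
            s, a = s2, a2
    return Q

def chain_step(s, a):
    """Toy chain: action 1 moves right, action 0 moves left; state 4 ends."""
    s2 = s + 1 if a == 1 else max(0, s - 1)
    done = s2 == 4
    return s2, (1.0 if done else 0.0), done

def prefer_right(s, a):
    """Hypothetical prior knowledge: moving right tends to be useful."""
    return 1.0 if a == 1 else 0.0

Q = hsarsa_lambda(n_states=5, n_actions=2, step=chain_step,
                  heuristic=prefer_right)
```

Setting `xi=0` reduces the sketch to plain SARSA(λ), and additionally setting `lam=0` reduces it to one-step SARSA, which mirrors the progression described in the abstract.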
Keywords: stochastic optimal scheduling; reinforcement learning; HSARSA(λ) algorithm; eligibility trace; heuristic function
Classification number: TV697.1 [Hydraulic Engineering: Water Conservancy and Hydropower Engineering]