A Q-Learning Differential Evolution Algorithm for Combined Heat and Power Dynamic Economic Emission Dispatch  (Cited by: 1)


Authors: FANG Shuai; CHEN Xu[1]; LI Kangji[1] (School of Electrical and Information Engineering, Jiangsu University, Zhenjiang 212013, China)

Affiliation: [1] School of Electrical and Information Engineering, Jiangsu University, Zhenjiang 212013, Jiangsu, China

Source: Electronic Science and Technology (《电子科技》), 2024, No. 5, pp. 9-17 (9 pages)

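Funding: National Natural Science Foundation of China (61873114); Youth Program of the Faculty of Agricultural Equipment, Jiangsu University (NZXB20210211).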

Abstract: Combined heat and power dynamic economic emission dispatch (CHPDEED) considers two objectives at once, fuel cost and pollutant gas emissions, and the heat and power output in each time period affects the output in the next period. It has become an important problem in power system operation in recent years. This paper proposes a Q-learning multi-objective differential evolution (QLMODE) algorithm to solve the CHPDEED problem. In QLMODE, Q-learning adjusts the scale factor parameter of the algorithm: during iteration, the dominance relationship between the offspring solution and the parent solution determines the reward or penalty for each action, and Q-learning updates the parameter value accordingly to obtain the algorithm parameters best suited to the environment model. QLMODE is applied to CHPDEED problems with 11 units and with 33 units. Simulation results show that, compared with four established multi-objective optimization algorithms, QLMODE achieves the lowest fuel cost and the lowest pollutant gas emissions, scores better on the convergence and diversity indicators, and obtains a better Pareto optimal front on both problems.
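The mechanism described in the abstract, Q-learning choosing the differential evolution scale factor with a reward derived from offspring/parent dominance, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the candidate scale factors, the state definition (the previously chosen factor), the reward values (+1, -1, 0), the epsilon-greedy policy, the learning parameters, and the names `ScaleFactorAgent`, `dominance_reward`, and `de_rand_1`. The mutation shown is the standard DE/rand/1 form; the abstract does not say which mutation strategy QLMODE uses.

```python
import random


def dominates(a, b):
    """Return True if objective vector a Pareto-dominates b (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def dominance_reward(child, parent):
    """Assumed reward scheme: +1 if the child dominates the parent,
    -1 if the parent dominates the child, 0 if they are mutually non-dominated."""
    if dominates(child, parent):
        return 1.0
    if dominates(parent, child):
        return -1.0
    return 0.0


def de_rand_1(x1, x2, x3, scale):
    """Standard DE/rand/1 mutation: v = x1 + F * (x2 - x3)."""
    return [a + scale * (b - c) for a, b, c in zip(x1, x2, x3)]


class ScaleFactorAgent:
    """Tabular Q-learning over a small, assumed set of scale factors."""

    def __init__(self, factors=(0.3, 0.5, 0.7, 0.9),
                 alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.factors = factors
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)
        n = len(factors)
        # Assumption: the state is the index of the previously chosen factor.
        self.q = [[0.0] * n for _ in range(n)]
        self.state = 0

    def choose(self):
        """Epsilon-greedy choice of an action (index into self.factors)."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.factors))
        row = self.q[self.state]
        return row.index(max(row))

    def update(self, action, reward):
        """One-step Q-learning update, then move to the next state."""
        next_state = action
        td_target = reward + self.gamma * max(self.q[next_state])
        self.q[self.state][action] += self.alpha * (td_target - self.q[self.state][action])
        self.state = next_state
```

In a full loop, each generation would call `choose()`, use `agent.factors[action]` as the scale factor in the mutation step, evaluate the offspring's (fuel cost, emission) pair against its parent with `dominance_reward`, and pass that reward to `update()`. The CHPDEED constraints and objective functions are outside the scope of this sketch.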

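Keywords: Q-learning; reinforcement learning; multi-objective algorithm; differential evolution; combined heat and power; economic emission dispatch; dynamic dispatch; power system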

Classification code: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]

 
