Multi-objective Dynamic Optimal Dispatch Method for CPS Order of Interconnected Power Grids Using Improved Hierarchical Reinforcement Learning

Cited by: 8

Authors: 余涛 [1], 王宇名 [2], 叶文加 [1], 刘前进 [1]

Affiliations: [1] School of Electric Power, South China University of Technology, Guangzhou 510640, Guangdong, China; [2] Zhongshan Power Supply Bureau, Guangdong Power Grid Corporation, Zhongshan 528400, Guangdong, China

Source: Proceedings of the CSEE (《中国电机工程学报》), 2011, No. 19, pp. 90-96

Funding: National Natural Science Foundation of China (50807016); Natural Science Foundation of Guangdong Province (9151064101000049); Fundamental Research Funds for the Central Universities (2009ZM0251)

Abstract: When classical reinforcement learning is applied to dispatching the automatic generation control (AGC) order under the control performance standard (CPS) from the dispatch center to the individual generating units of a power grid, the curse of dimensionality is unavoidable. This paper proposes a hierarchical reinforcement learning (HRL) approach: the committed AGC units are first classified by their response time delay in power regulation, and the CPS order is then dispatched layer by layer, forming a task hierarchy. A time-varying coordination factor is introduced between the layers of the hierarchical Q-learning algorithm, which markedly accelerates the original algorithm's convergence. Several linear combinations of weights in the reward function are designed to optimize hydro capacity margin and AGC production cost, revealing the trade-off between the system's CPS performance and its regulation cost under conservative versus optimistic control. Statistical simulations on the China Southern Power Grid model show that, compared with plain hierarchical Q-learning, the improved algorithm shortens average convergence time by 47%; under complex stochastic disturbances it effectively raises the CPS assessment pass rate and, compared with hierarchical Q-learning and a genetic algorithm, reduces AGC production cost by about 5%.
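The abstract describes a hierarchical Q-learning scheme in which units are pre-grouped by regulation delay and a time-varying coordination factor couples the learning layers. The toy sketch below illustrates just that idea on a single-state dispatch problem; the environment, unit parameters, reward weights, and factor schedule are all invented for illustration and are not the paper's model.

```python
import random

random.seed(0)

# Toy sketch: dispatch a CPS regulation order between two unit groups
# (pre-classified by response delay), learning with Q-learning whose
# update is scaled by a time-varying coordination factor.

ORDER = 100.0  # total regulation power to dispatch (MW, toy value)
GROUPS = [     # (capacity MW, cost per MW) per group -- assumed numbers
    (60.0, 1.0),   # fast, cheap group (hydro-like)
    (80.0, 2.5),   # slower, expensive group (thermal-like)
]
SPLITS = [0.2, 0.4, 0.6, 0.8]  # actions: share of the order sent to group 0

def reward(split, w_cost=0.3):
    # weighted linear combination of tracking error and regulation cost,
    # echoing the paper's weighted reward design (weights are assumptions)
    p0 = min(split * ORDER, GROUPS[0][0])
    p1 = min((1 - split) * ORDER, GROUPS[1][0])
    err = abs(ORDER - (p0 + p1))
    cost = p0 * GROUPS[0][1] + p1 * GROUPS[1][1]
    return -(1 - w_cost) * err - w_cost * cost / 10.0

def coordination_factor(k, k0=200.0):
    # time-varying coupling, decaying with episode k (assumed schedule)
    return 1.0 / (1.0 + k / k0)

q_top = [0.0] * len(SPLITS)  # single-state Q-table over split actions
alpha = 0.1
for k in range(2000):
    eps = max(0.05, 1.0 - k / 1000)  # decaying epsilon-greedy exploration
    if random.random() < eps:
        a = random.randrange(len(SPLITS))
    else:
        a = q_top.index(max(q_top))
    r = reward(SPLITS[a])
    # the coordination factor scales how strongly this layer's estimate
    # is pulled toward the observed reward
    q_top[a] += alpha * coordination_factor(k) * (r - q_top[a])

best = SPLITS[q_top.index(max(q_top))]
print("learned split for fast group:", best)
```

With these toy numbers the learned split settles at 0.6, i.e. the fast cheap group is loaded to its full 60 MW capacity before the expensive group takes the remainder, which is the qualitative behavior the cost-weighted reward is meant to produce.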

Keywords: hierarchical reinforcement learning; coordination factor; stochastic optimization; control performance standard; automatic generation control

Classification: TM71 [Electrical Engineering, Power Systems and Automation]

 
