Optimization of Inverter Multi-Frequency Control Parameters Based on Deep Reinforcement Learning

Authors: QIN Risheng (覃日升); KUANG Hua (况华); JIANG He (姜訸); YU Hui (于辉); LI Hong (李虹); WAN Mingkai (万明凯); YIN Yilin (殷一林); LEI Wanjun (雷万钧) [5]

Affiliations: [1] Electric Power Science Research Institute of Yunnan Power Grid Co., Ltd., Kunming 650214, Yunnan, China; [2] Yunnan Power Grid Co., Ltd., Kunming 650011, Yunnan, China; [3] Honghe Power Supply Bureau of Yunnan Power Grid Co., Ltd., Honghe 661199, Yunnan, China; [4] Dali Yongping Power Supply Bureau of Yunnan Power Grid Co., Ltd., Dali 672699, Yunnan, China; [5] School of Electrical Engineering, Xi'an Jiaotong University, Xi'an 710049, Shaanxi, China

Source: Power System and Clean Energy (《电网与清洁能源》), 2024, No. 7, pp. 124-132 (9 pages)

Funding: National Key Research and Development Program of China (2018YFB0905800).

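Abstract (translated from the Chinese): Conventional closed-loop inverter control offers good steady-state and dynamic performance, but it relies heavily on an accurate mathematical model of the system and adapts poorly to the model-parameter perturbations that arise when the inverter is connected to different loads or grid conditions. This paper applies deep reinforcement learning to the tuning of multi-frequency inverter control parameters. First, a control model of a single-phase voltage-source inverter is established. On this basis, the instability features that appear during parameter adjustment are analyzed, and an FFT-based method for detecting them is designed, enabling online monitoring of the inverter's stability while its parameters are being adjusted. Next, the adaptive tuning of the control parameters is modeled as a Markov process, and the agent's state, action, and reward function are designed. To address the imbalance in the agent's samples, prioritized experience replay and an action-masking mechanism are introduced to improve learning efficiency. After training in simulation, the agent self-tunes the parameters of the proportional-resonant controller to obtain the best multi-frequency tracking performance. Finally, experiments on a purpose-built platform show that the trained agent obtains control parameters meeting the accuracy requirements at every frequency point after a small number of training episodes, and that the system remains stable throughout training.

Abstract (published English version): Traditional closed-loop inverter control has good static and dynamic performance, but it depends heavily on a precise mathematical model of the system and has difficulty adapting to the model-parameter disturbances caused by connecting the inverter to different loads or grid environments. In this paper, deep reinforcement learning is applied to the tuning of the inverter's multi-frequency control parameters. First, a method for approximately calculating the resonance coefficient from a steady-state error index is proposed. Using this method, the resonance coefficient values at different frequency points and under different load requirements are analyzed, which guides the design of the resonance coefficient. Second, an FFT-based method for judging stability characteristics is designed to monitor the stability of the inverter output online. On this basis, the adaptive process of the inverter control parameters is modeled as a Markov process, and the state, action, and reward function of the agent are designed. To address the imbalance in the agent's samples, prioritized experience replay and an action-masking mechanism are introduced to improve the agent's learning efficiency. After training in simulation, the agent self-tunes the parameters of the proportional-resonant controller to obtain the best multi-frequency tracking performance. Finally, experiments are carried out on the experimental platform, and the results show that the trained agent can obtain control parameters that meet the accuracy requirements at each frequency point after a small number of training episodes, and that the system remains stable throughout training.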
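The abstract does not spell out the FFT-based stability criterion, so the following is only a minimal sketch of one plausible form of it, under stated assumptions: the output is flagged as unstable when too much of its spectral energy lies away from the controlled frequency points. The function names (`fft`, `off_target_ratio`, `is_unstable`), the 5 % threshold, the sampling rate, the 50/150/250 Hz frequency points, and the 1200 Hz test oscillation are all illustrative choices, not values from the paper. A small radix-2 FFT is written out so the example needs only the standard library; in practice a library FFT would normally be used.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def off_target_ratio(samples, fs, targets, tol_bins=1):
    """Fraction of non-DC spectral energy lying away from the target frequencies."""
    n = len(samples)
    spectrum = fft(samples)
    # Positive-frequency bins only, DC excluded; list index 0 is bin 1.
    power = [abs(c) ** 2 for c in spectrum[1 : n // 2]]
    target_bins = {round(f * n / fs) for f in targets}
    total = sum(power)
    if total == 0.0:
        return 0.0
    off = sum(
        p
        for k, p in enumerate(power, start=1)
        if all(abs(k - b) > tol_bins for b in target_bins)
    )
    return off / total


def is_unstable(samples, fs, targets, threshold=0.05):
    """Assumed criterion: unstable if off-target energy exceeds the threshold."""
    return off_target_ratio(samples, fs, targets) > threshold


# Illustrative signals (assumed values): 10 Hz bin spacing puts every tone on a bin.
FS, N = 10240, 1024
TARGETS = [50.0, 150.0, 250.0]
t = [i / FS for i in range(N)]
stable = [
    math.sin(2 * math.pi * 50 * ti)
    + 0.2 * math.sin(2 * math.pi * 150 * ti)
    + 0.1 * math.sin(2 * math.pi * 250 * ti)
    for ti in t
]
# Same output with a superimposed 1200 Hz oscillation standing in for instability.
oscillating = [s + 0.5 * math.sin(2 * math.pi * 1200 * ti) for s, ti in zip(stable, t)]

print(is_unstable(stable, FS, TARGETS), is_unstable(oscillating, FS, TARGETS))
```

A scalar such as `off_target_ratio` could in principle serve as one component of the agent's state or as a penalty term in its reward, but how the paper actually uses the stability flag is not stated in the abstract.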

Keywords: inverter; proportional-resonant control; deep reinforcement learning; parameter optimization

Classification code: TM464 [Electrical Engineering: Electrical Apparatus]
