Building Energy Consumption Control Based on DDPG Algorithm of Self-supervised Network  Cited by: 1


Authors: YIN Yu-Zhu; CHEN Jian-Ping [2,3]; FU Qi-Ming [1,2,3]; LU You; WU Hong-Jie [1,2,3]

Affiliations: [1] School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China; [2] Jiangsu Province Key Laboratory of Intelligent Building Energy Efficiency, Suzhou University of Science and Technology, Suzhou 215009, China; [3] Suzhou Key Laboratory of Mobile Network Technology and Application, Suzhou University of Science and Technology, Suzhou 215009, China

Source: Computer Systems & Applications, 2022, Issue 2, pp. 161-167 (7 pages)

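Funding: National Key R&D Program of China (2020YFC200660); National Natural Science Foundation of China (62072324, 61876217, 61876121, 61772357); Key R&D Program of Jiangsu Province (BE2017663).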

Abstract: To address the sparse-reward problem that arises when reinforcement learning is used to train energy consumption control systems, a deep deterministic policy gradient (DDPG) method based on a self-supervised network is applied to building energy consumption control. First, the state and action variables are processed and fed into the forward model of the self-supervised network, which predicts the feature vector of the next state; the prediction error serves as a curiosity-driven internal reward, which addresses the sparse-reward problem. Then, a data-driven method is used to train the building energy consumption model, with weather data as input and energy consumption data as output. Finally, the DDPG method based on the self-supervised network is used to solve for the optimal control policy, which sets the optimal discharge temperature of the air handling unit (AHU) and thereby reduces equipment energy consumption. Experimental results show that the method achieves good energy savings while keeping the building environment comfortable.
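The curiosity mechanism described in the abstract (a forward model predicts the next-state feature vector, and its prediction error becomes an internal reward) can be sketched as follows. This is only a minimal illustration and not the authors' implementation: the linear model, the class name `ForwardModel`, the scale `eta`, the mixing weight `beta`, and the additive combination in `total_reward` are all assumptions made for the example. The paper's network architecture and reward weighting are not given in this record.

```python
import random


class ForwardModel:
    """Hypothetical linear forward model: (state features, action) -> next-state features."""

    def __init__(self, feat_dim, act_dim, lr=0.1, seed=0):
        rng = random.Random(seed)
        self.in_dim = feat_dim + act_dim
        # One weight row per predicted feature; small random initialisation.
        self.W = [[rng.uniform(-0.1, 0.1) for _ in range(self.in_dim)]
                  for _ in range(feat_dim)]
        self.lr = lr

    def predict(self, phi, action):
        """Predict the next-state feature vector from current features and action."""
        x = list(phi) + list(action)
        return [sum(w * xi for w, xi in zip(row, x)) for row in self.W]

    def intrinsic_reward(self, phi, action, phi_next, eta=1.0):
        """Curiosity reward: scaled squared prediction error (eta is an assumed scale)."""
        pred = self.predict(phi, action)
        return 0.5 * eta * sum((p - t) ** 2 for p, t in zip(pred, phi_next))

    def update(self, phi, action, phi_next):
        """One gradient-descent step on the squared prediction error."""
        x = list(phi) + list(action)
        pred = self.predict(phi, action)
        for i, row in enumerate(self.W):
            err = pred[i] - phi_next[i]
            for j in range(self.in_dim):
                row[j] -= self.lr * err * x[j]


def total_reward(r_ext, r_int, beta=0.2):
    """Assumed combination: external reward plus beta-weighted curiosity bonus."""
    return r_ext + beta * r_int
```

In this sketch, a transition the forward model already predicts well yields a small bonus, while an unfamiliar transition yields a large one, so the agent receives a learning signal even when the external reward is sparse. In a DDPG training loop, the value returned by `total_reward` would take the place of the environment reward stored in the replay buffer.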

Keywords: reinforcement learning; self-supervised network; DDPG algorithm; energy consumption control

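Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]; TU111.195 [Automation and Computer Technology: Control Science and Engineering]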

 
