Energy-saving control for subway station air conditioning systems based on reinforcement learning (基于强化学习的地铁站空调系统节能控制)

Cited by: 11

Authors: JIAO Huan-yan (焦焕炎); FENG Hao-dong (冯浩东); WEI Dong (魏东); RAN Yi-bing (冉义兵); HU Chao-wen (胡朝文)

Affiliations: [1] School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing 100044, China; [2] Beijing Key Laboratory of Intelligent Processing for Building Big Data, Beijing 100044, China; [3] Beijing Xingchuang Land Real Estate Development Co., Ltd, Beijing 102600, China

Source: Control and Decision (《控制与决策》), 2022, No. 12, pp. 3139-3148 (10 pages)

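Funding: Beijing Municipal Universities High-Level Innovation Team Construction Program (IDHT20190506); Key Science and Technology Program of the Beijing Municipal Education Commission (KZ201810016019); Fundamental Research Funds for Beijing Municipal Universities, Beijing University of Civil Engineering and Architecture (X20068).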

Abstract: Subway station air conditioning systems consume a large amount of energy. Traditional control methods cannot balance comfort against energy saving, so control performance is poor. In addition, current subway station air conditioning control systems control the air system and the water system separately, which cannot guarantee energy savings for the system as a whole. This paper therefore proposes an energy-saving control strategy for the air conditioning system based on reinforcement learning. First, a neural network model of the air conditioning system is built and used as the simulation environment for training the agent offline, which addresses the long convergence time of model-free reinforcement learning methods trained online. Then, to improve algorithm efficiency and to handle the multidimensional continuous action space of subway station air conditioning systems, a deep deterministic policy gradient algorithm based on multi-step prediction is proposed, and an agent framework is designed to interact with the environment model during training. In addition, a termination condition for agent training is set to determine the best number of training iterations, which further improves algorithm efficiency. Finally, simulation experiments are carried out on measured operating data from a subway station in Wuhan. The results show that the proposed control strategy tracks temperature well, maintains platform comfort, and saves about 17.908% energy compared with the actual system currently in operation.
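The abstract does not give the formula behind "multi-step prediction". One common reading is an n-step bootstrapped target for the DDPG critic, in which several predicted rewards are accumulated before bootstrapping from the target networks. The sketch below illustrates that reading only; the function name `n_step_target`, the toy actor and critic, and the discount factor are assumptions made for illustration and are not taken from the paper.

```python
from typing import Callable, Sequence

State = Sequence[float]
Action = Sequence[float]


def n_step_target(
    rewards: Sequence[float],
    final_state: State,
    target_actor: Callable[[State], Action],
    target_critic: Callable[[State, Action], float],
    gamma: float = 0.99,
) -> float:
    """Compute sum_k gamma^k * r_k + gamma^n * Q'(s_n, mu'(s_n)).

    rewards     -- the n rewards collected (or predicted) from step t onward
    final_state -- the state reached after those n steps
    """
    # Discounted sum of the n intermediate rewards.
    discounted = sum((gamma ** k) * r for k, r in enumerate(rewards))
    # Bootstrap from the target networks at the state n steps ahead.
    bootstrap_action = target_actor(final_state)
    bootstrap_value = target_critic(final_state, bootstrap_action)
    return discounted + (gamma ** len(rewards)) * bootstrap_value


# Hypothetical stand-ins: a constant two-dimensional policy (for example,
# one air-side and one water-side setpoint) and a critic that sums its inputs.
def toy_actor(state: State) -> Action:
    return [0.5, 0.5]


def toy_critic(state: State, action: Action) -> float:
    return sum(state) + sum(action)


three_step = n_step_target([1.0, 1.0, 1.0], [2.0], toy_actor, toy_critic, gamma=0.5)
one_step = n_step_target([1.0], [2.0], toy_actor, toy_critic, gamma=0.5)
```

With a single reward the function reduces to the ordinary one-step DDPG target, so the number of rewards passed in is the only setting that separates the two cases. In an offline setup like the one the abstract describes, the rewards and the final state would come from rolling the neural network environment model forward rather than from the real station.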

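Keywords: reinforcement learning; deep deterministic policy gradient; neural network; multi-step prediction; subway station air conditioning system; energy-saving control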

Classification: TP273 [Automation and Computer Technology: Detection Technology and Automatic Equipment]
