Authors: 焦焕炎 (JIAO Huan-yan) [1,2]; 冯浩东 (FENG Hao-dong) [1,2]; 魏东 (WEI Dong) [1,2]; 冉义兵 (RAN Yi-bing) [1,2]; 胡朝文 (HU Chao-wen) [3]
Affiliations: [1] School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing 100044, China; [2] Beijing Key Laboratory of Intelligent Processing for Building Big Data, Beijing 100044, China; [3] Beijing Xingchuang Land Real Estate Development Co., Ltd, Beijing 102600, China
Source: Control and Decision (《控制与决策》), 2022, No. 12, pp. 3139-3148 (10 pages)
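Funding: High-Level Innovation Team Construction Program for Beijing Municipal Universities (IDHT20190506); Key Project of the Science and Technology Plan of the Beijing Municipal Education Commission (KZ201810016019); Fundamental Research Funds for Municipal Universities, Beijing University of Civil Engineering and Architecture (X20068).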
Abstract: Subway station air conditioning systems consume a large amount of energy. Traditional control methods cannot balance comfort against energy saving, so control performance is poor. Moreover, current subway station air conditioning control systems control the air system and the water system separately, which cannot guarantee the energy-saving effect of the system as a whole. This paper therefore proposes a reinforcement-learning-based energy-saving control strategy for the air conditioning system. First, a neural network is used to build an air conditioning system model that serves as the simulation environment for offline training of the agent, which addresses the long convergence time of model-free reinforcement learning methods when trained online. Then, to improve algorithm efficiency and to handle the multidimensional continuous action space of subway station air conditioning systems, a deep deterministic policy gradient algorithm based on multi-step prediction is proposed, and an agent framework is designed to interact with the environment model for training. In addition, a training termination condition is set for the agent to determine the optimal number of training iterations, which further improves algorithm efficiency. Finally, simulation experiments are conducted on measured operating data from a subway station in Wuhan. The results show that the proposed control strategy tracks temperature well, ensures platform comfort, and saves about 17.908% of energy compared with the current actual system.
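The abstract does not say how multi-step prediction enters the deep deterministic policy gradient update. One common reading, offered here only as an assumption, is an n-step bootstrapped critic target: the rewards of n consecutive steps are discounted and summed, and the target critic's value estimate at step n closes the sum. The function name `n_step_target` and its parameters are hypothetical and do not come from the paper.

```python
def n_step_target(rewards, bootstrap_value, gamma=0.99):
    """Compute an n-step bootstrapped return (assumed form, not the paper's).

    rewards:         rewards r_0 .. r_{n-1} observed over n consecutive steps
    bootstrap_value: target-critic estimate Q'(s_n, mu'(s_n)) at the final state
    gamma:           discount factor

    Returns sum_{k=0}^{n-1} gamma^k * r_k + gamma^n * bootstrap_value.
    """
    target = bootstrap_value
    # Fold the rewards in from the last step back to the first, so each pass
    # applies one more factor of gamma to everything accumulated so far.
    for r in reversed(rewards):
        target = r + gamma * target
    return target


# Two steps with reward 1.0 each, a bootstrap value of 10.0, and gamma = 0.5
print(n_step_target([1.0, 1.0], 10.0, gamma=0.5))
```

With n = 1 this reduces to the standard one-step DDPG critic target; larger n propagates reward information faster, at the cost of more variance in the target.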
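Keywords: reinforcement learning; deep deterministic policy gradient; neural network; multi-step prediction; subway station air conditioning system; energy-saving control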
Classification: TP273 [Automation and Computer Technology: Detection Technology and Automatic Devices]