Authors: Wang Qiaochu (王翘楚), Ding Yan (丁研) [1], Liang Chuanzhi (梁传志) [2], Zhang Haozheng (张颢正), Huang Chen (黄宸)
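Affiliations: [1] School of Environmental Science and Engineering, Tianjin University, Tianjin 300354, China; [2] Technology and Industrialization Development Center of the Ministry of Housing and Urban-Rural Development, Beijing 100835, China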
Source: Journal of System Simulation (《系统仿真学报》), 2024, No. 12, pp. 2884-2893 (10 pages)
Abstract: In the initial stage of online deployment of air-conditioning scheduling, low-quality data conditions lead to unstable performance and an inefficient training process. To address this, a method for formulating air-conditioning scheduling strategies in simulation, based on transfer learning and imitation learning, is proposed. Building operation strategies are obtained through reinforcement learning; a standard building simulation model is established as the source domain for transfer learning; and an imitation learning loss function is incorporated into the agent's loss function to enhance algorithm performance. The results show that, compared with the method without transfer learning, the operational benefit improves by 16.2%, which effectively resolves the operational instability in the early stage of reinforcement learning training. Compared with the method without imitation learning, the operational benefit improves by 11.5%, which effectively improves the training efficiency of reinforcement learning.
Keywords: transfer learning; reinforcement learning; imitation learning; air-conditioning control method; room temperature control
Classification: TP29 [Automation and Computer Technology: Detection Technology and Automatic Devices]
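
Note: the abstract states that an imitation learning loss is added to the agent's loss function but does not give its form. The sketch below is only one common way such a combination might look, assuming a value-based agent with a squared temporal-difference loss and a cross-entropy imitation term on an expert action. The function names, the `il_weight` parameter, and the choice of loss terms are all illustrative assumptions, not the paper's formulation. The transfer learning step (pre-training on the source-domain building model) is not shown.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def td_loss(q_values, action, reward, next_q_values, gamma=0.99):
    """Squared one-step temporal-difference error (assumed RL loss)."""
    target = reward + gamma * max(next_q_values)
    return (q_values[action] - target) ** 2


def imitation_loss(q_values, expert_action):
    """Cross-entropy between softmax(Q) and the expert's action (assumed IL loss)."""
    probs = softmax(q_values)
    return -math.log(probs[expert_action])


def combined_loss(q_values, action, reward, next_q_values,
                  expert_action, il_weight=0.5, gamma=0.99):
    """Hypothetical agent loss: RL term plus a weighted imitation term."""
    rl = td_loss(q_values, action, reward, next_q_values, gamma)
    il = imitation_loss(q_values, expert_action)
    return rl + il_weight * il
```

With `il_weight=0` the sketch reduces to the plain temporal-difference loss, which makes the comparison "with versus without imitation learning" easy to express in one function.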