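Affiliations: [1] School of Electronic and Information Engineering, Hebei University of Technology, Beichen District, Tianjin 300401 [2] Longshine Technology Group Co., Ltd., Heping District, Tianjin 300041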
Source: Power System Technology (《电网技术》), 2022, No. 6, pp. 2104-2113 (10 pages)
Funding: National Key R&D Program of China, "Intelligent Robots" Key Special Project (2019YFB1312102); Natural Science Foundation of Hebei Province (F2019202364).
Abstract: Load curve clustering is a foundation of electric power big data research: clustering reveals users' electricity consumption patterns, which in turn supports power dispatch decisions. Traditional clustering methods struggle with high-dimensional multivariate data, have difficulty extracting temporal features, and separate feature extraction from the clustering process. To address these problems, this paper applies a deep convolutional embedded clustering method based on a one-dimensional convolutional autoencoder (DCEC-1D) to cluster load curves and extract typical load curves. First, a one-dimensional convolutional autoencoder (1D-CAE) extracts features, which are fed to K-means to obtain the initial cluster centers. Next, a custom clustering layer computes soft assignments of the extracted load features. Finally, to avoid distorting the embedding space, the clustering loss and the reconstruction loss are combined into a single loss function and optimized jointly to yield the final clustering result. The case study uses measured data from Portuguese residential users in a dataset published by the University of California, Irvine (UCI). Three clustering indices, the Davies-Bouldin index (DBI), the Calinski-Harabasz index (CHI), and the silhouette coefficient (SC), are used for quantitative analysis, and t-distributed stochastic neighbor embedding (t-SNE) is used for visualization. The experimental results show that the method's clustering indices improve substantially over traditional K-means and principal component analysis (PCA) + K-means. Compared with improved deep embedded clustering with local structure preservation (IDEC), deep embedded clustering based on a one-dimensional convolutional autoencoder (DEC-1D-CAE), and 1D-CAE + K-means, the proposed method reduces DBI by about 0.15, 0.08, and 1.50, increases CHI by about 19384.92, 12488.48, and 36485.72, and increases SC by about 0.10, 0.05, and 0.63, respectively.
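The abstract does not give the formulas for the clustering layer or the joint loss. The following is a minimal sketch that assumes they follow the usual deep embedded clustering formulation: a Student's t kernel for the soft assignment, a sharpened target distribution, a KL-divergence clustering loss, and a mean-squared reconstruction loss. The function names, the weight `gamma`, and the toy embeddings are hypothetical and are not taken from the paper.

```python
import math

def soft_assign(z, centers, alpha=1.0):
    """Soft assignment q[i][j] of embedding i to cluster j (Student's t kernel, assumed)."""
    q = []
    for zi in z:
        row = []
        for mu in centers:
            d2 = sum((a - b) ** 2 for a, b in zip(zi, mu))  # squared distance
            row.append((1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(row)
        q.append([v / total for v in row])  # normalise over clusters
    return q

def target_distribution(q):
    """Sharpened target p[i][j] proportional to q[i][j]^2 / f[j], with f[j] = sum_i q[i][j]."""
    f = [sum(row[j] for row in q) for j in range(len(q[0]))]
    p = []
    for row in q:
        w = [row[j] ** 2 / f[j] for j in range(len(row))]
        total = sum(w)
        p.append([v / total for v in w])
    return p

def clustering_loss(p, q):
    """KL(P || Q) summed over samples and clusters."""
    return sum(pij * math.log(pij / qij)
               for p_row, q_row in zip(p, q)
               for pij, qij in zip(p_row, q_row) if pij > 0)

def reconstruction_loss(x, x_hat):
    """Mean squared error between inputs and autoencoder outputs."""
    n = sum(len(row) for row in x)
    return sum((a - b) ** 2
               for x_row, h_row in zip(x, x_hat)
               for a, b in zip(x_row, h_row)) / n

def joint_loss(x, x_hat, p, q, gamma=0.1):
    """Reconstruction loss plus gamma times clustering loss (gamma is a hypothetical weight)."""
    return reconstruction_loss(x, x_hat) + gamma * clustering_loss(p, q)

# Toy 2-D embeddings and centers standing in for 1D-CAE features and K-means centers.
z = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]]
centers = [[0.0, 0.0], [5.0, 5.0]]
q = soft_assign(z, centers)
p = target_distribution(q)
```

In this formulation the target distribution pushes each sample toward its most confident cluster, while the reconstruction term keeps the encoder from collapsing the embedding space, which matches the abstract's stated reason for optimizing the two losses jointly.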
Keywords: deep embedded clustering; convolutional autoencoder; temporal feature extraction; typical load curve; joint optimization
Classification code: TM721 [Electrical Engineering: Power Systems and Automation]