Bad data identification and correction method based on improved k-prototypes clustering  Cited by: 9


Authors: Wang Xiaoci; Dong Shufeng[1]; Liu Yuquan; Wang Li; Li Junge (School of Electrical Engineering, Zhejiang University, Hangzhou 310027, China; Guangzhou Power Supply Bureau Co., Ltd., Guangzhou 510620, China)

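Affiliations: [1] School of Electrical Engineering, Zhejiang University, Hangzhou 310027 [2] Guangzhou Power Supply Bureau Co., Ltd., Guangzhou 510620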

Source: Electrical Measurement & Instrumentation (《电测与仪表》), 2022, No. 2, pp. 9-15 (7 pages)

Funding: National Key R&D Program of China (2016YFB0901300).

Abstract: Many industrial technologies depend on accurate load data, yet the load measurement systems currently used in factories often produce large amounts of bad data because of communication and storage failures. This paper therefore proposes a bad data identification and correction method based on improved k-prototypes clustering. By introducing non-load features into the clustering, the method weakens the influence of bad load data on the clustering result, making identification and repair more accurate. The improved k-prototypes algorithm uses random initialization and parallel runs from which the best result is selected, overcoming the tendency of standard k-prototypes to fall into a local optimum that depends on the initial cluster centers. It also determines the number of clusters adaptively, which removes the need to choose that number subjectively. Based on the clustering result, the feasible region of the load data is determined according to the normal distribution, bad data are identified as values outside that region, and the identified values are corrected by cluster-center replacement. Experiments show that the proposed method outperforms fuzzy C-means clustering that considers only load data, significantly improving both the recall of bad data identification and the accuracy of correction.
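The identification and correction step described in the abstract can be illustrated with a minimal sketch. It assumes that cluster labels have already been produced by some clustering step, and it is not the authors' implementation. The function name `identify_and_correct`, the threshold `n_sigma=3.0`, and the choice to replace a flagged value with the mean of the in-range members of its cluster are all assumptions made for illustration; the abstract states only that the feasible region follows the normal distribution and that bad data are replaced by the cluster center.

```python
import numpy as np


def identify_and_correct(load, labels, n_sigma=3.0):
    """Flag loads outside a per-cluster feasible region and replace them.

    Assumptions (not taken from the paper): the feasible region is
    mean +/- n_sigma * std within each cluster, and a flagged value is
    replaced by the mean of that cluster's in-range values.
    """
    load = np.asarray(load, dtype=float)
    labels = np.asarray(labels)
    corrected = load.copy()
    is_bad = np.zeros(load.shape, dtype=bool)

    for k in np.unique(labels):
        members = labels == k
        center = load[members].mean()
        spread = load[members].std()
        lower = center - n_sigma * spread
        upper = center + n_sigma * spread

        # Values outside [lower, upper] are treated as bad data.
        bad = members & ((load < lower) | (load > upper))
        good = members & ~bad

        # Replacement value: mean of in-range members, falling back to
        # the plain cluster mean if every member was flagged.
        replacement = load[good].mean() if good.any() else center

        is_bad |= bad
        corrected[bad] = replacement

    return is_bad, corrected


# Hypothetical data: cluster 0 has one dropout (0.0) among readings of 100.0,
# cluster 1 contains only clean readings of 50.0.
load = np.array([100.0] * 19 + [0.0] + [50.0] * 10)
labels = np.array([0] * 20 + [1] * 10)
is_bad, corrected = identify_and_correct(load, labels)
```

In this sketch the labels are taken as given. The abstract's point is that they would come from clustering on load and non-load features together, so that a corrupted load value does not by itself move a sample into the wrong cluster.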

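Keywords: k-prototypes clustering; mixed-data clustering; bad data identification; cluster-center replacement correction; industrial load preprocessing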

Classification code: TM734 [Electrical Engineering: Power Systems and Automation]

 
