Algorithm Improvement Based on Power Data Clustering Analysis  Cited by: 3

The K-means Algorithm Improvement Based on Power Data Clustering Analysis


Authors: YANG Li; SHEN Xin; LI Yingna[2]; LI Mengmeng[2]

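Affiliations: [1] Electric Power Research Institute of Yunnan Power Grid Company, Kunming 650217, China; [2] Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China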

Source: Yunnan Electric Power (《云南电力技术》), 2017, No. 6, pp. 64-68 (5 pages)

Abstract: To improve the precision of information retrieval, the K-means algorithm is applied in a search engine for electric power data. The traditional algorithm has two shortcomings: the random choice of initial cluster centers affects clustering quality, and the uncertainty of the K value makes the clustering results unstable. This paper proposes an improved K-means algorithm that addresses both. The improved algorithm selects the initial cluster centers by farthest Euclidean distance. It then runs multiple trial clusterings, computes the total variance of the cluster means for each result, and takes as the number of clusters the K value at which the variance stops decreasing. Test results show that the improved algorithm clusters automatically and improves the clustering effect by 10%, overcoming the shortcomings of the original algorithm while retaining its simplicity and efficiency.
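The two ideas in the abstract, farthest-distance initialization and choosing K where the variance stops decreasing, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives only an outline, so several details here are assumptions. The first center is taken as the point farthest from the data mean, the "total variance" is read as the sum of squared distances from points to their cluster means, and the hypothetical parameters `k_max` and `tol` decide when the variance counts as no longer decreasing.

```python
def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def farthest_init(points, k):
    """Pick k initial centers by farthest Euclidean distance (assumed rule)."""
    # Assumption: start from the point farthest from the overall data mean.
    grand_mean = mean(points)
    chosen = [max(points, key=lambda p: dist2(p, grand_mean))]
    while len(chosen) < k:
        # Next center: the point whose nearest chosen center is farthest away.
        chosen.append(max(points, key=lambda p: min(dist2(p, c) for c in chosen)))
    return chosen

def kmeans(points, k, max_iter=100):
    """Standard K-means iterations from the deterministic farthest-point start."""
    centers = farthest_init(points, k)
    for _ in range(max_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist2(p, centers[i]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        new_centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers, clusters

def total_variance(centers, clusters):
    """Sum of squared distances from each point to its cluster mean (assumed reading)."""
    return sum(dist2(p, c) for c, members in zip(centers, clusters) for p in members)

def choose_k(points, k_max=10, tol=0.2):
    """Return the K after which the variance no longer drops by more than tol (relative)."""
    limit = min(k_max, len(points))
    prev = None
    for k in range(1, limit + 1):
        centers, clusters = kmeans(points, k)
        var = total_variance(centers, clusters)
        if prev is not None and prev - var <= tol * prev:
            return k - 1
        prev = var
    return limit
```

Because the initialization is deterministic, repeated runs on the same data give the same clustering, which is the stability the abstract aims for. The relative threshold `tol` is a design choice of this sketch: the sum of squared distances usually keeps shrinking slightly as K grows, so some cutoff is needed to decide when it has effectively stopped decreasing.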

Keywords: K-means; K value; initial cluster centers; total variance of cluster means

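Classification codes: TM769 [Electrical Engineering: Power Systems and Automation]; TP311.13 [Automation and Computer Technology: Computer Software and Theory]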

 
