An Association Rule Clustering Algorithm Based on K-means  Cited by: 6

Authors: Wang Zhuo (王琢)[1], Xun Yaling (荀亚玲)[1], Zhang Jifu (张继福)[1]

Affiliation: [1] School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024

Source: Journal of Taiyuan University of Science and Technology (《太原科技大学学报》), 2016, No. 6, pp. 429-437 (9 pages)

Abstract: Association rules are one of the main research topics in the field of data mining. When rules are mined from massive, high-dimensional data sets, and especially when the support or confidence threshold is set too low, a large number of redundant and similar association rules are generated, which makes the rules difficult to understand and use. This paper presents an association rule clustering algorithm based on an improved K-means idea. First, redundant association rules are redefined and a method for deleting them is given. Second, a new similarity measure between rules is defined according to the structural characteristics of the antecedents and consequents of association rules. Third, the largest triangle method is used to select the initial cluster points and, following the idea of K-means, the rules that remain after redundant rules are deleted are clustered so that similar rules fall into the same cluster. Experiments on celestial spectra data and simulated datasets verify the feasibility and effectiveness of the algorithm: it helps users find useful association rules quickly and efficiently and improves the understandability of association rules.

Keywords: association rule clustering algorithm; redundant association rules; similarity measure; stellar spectral data

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]
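
Illustrative sketch: the abstract describes a three-step pipeline (delete redundant rules, measure similarity between rules, cluster with a K-means-style loop), but this record does not give the paper's actual definitions. The Python below is only a rough sketch under stated assumptions. The redundancy test, the Jaccard-based similarity, and every function name are hypothetical. Farthest-first seeding is swapped in for the paper's largest triangle method, and a K-medoids-style update replaces the mean, since a mean of item sets is not defined here.

```python
from itertools import combinations

# A rule is (antecedent, consequent, confidence); item sets are frozensets.

def jaccard(a, b):
    """Jaccard similarity of two item sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def rule_similarity(r1, r2):
    """Assumed measure: mean Jaccard overlap of antecedents and consequents."""
    return 0.5 * (jaccard(r1[0], r2[0]) + jaccard(r1[1], r2[1]))

def remove_redundant(rules):
    """Assumed redundancy test: drop a rule when another rule has the same
    consequent, a strictly smaller antecedent, and confidence at least as high."""
    return [
        r for r in rules
        if not any(s[1] == r[1] and s[0] < r[0] and s[2] >= r[2] for s in rules)
    ]

def farthest_first_seeds(rules, k):
    """Stand-in for largest triangle seeding: start from the least similar
    pair, then keep adding the rule least similar to the chosen seeds.
    Assumes 1 <= k <= len(rules) and at least two rules."""
    i, j = min(combinations(range(len(rules)), 2),
               key=lambda p: rule_similarity(rules[p[0]], rules[p[1]]))
    seeds = [i, j][:k]
    while len(seeds) < k:
        rest = [n for n in range(len(rules)) if n not in seeds]
        seeds.append(min(rest, key=lambda n: max(
            rule_similarity(rules[n], rules[s]) for s in seeds)))
    return seeds

def cluster_rules(rules, k, max_iter=20):
    """K-medoids-style loop: assign each rule to its most similar medoid,
    then re-pick each medoid as the member most similar to its own cluster."""
    medoids = farthest_first_seeds(rules, k)
    clusters = {}
    for _ in range(max_iter):
        clusters = {m: [] for m in medoids}
        for n, r in enumerate(rules):
            best = max(medoids, key=lambda m: rule_similarity(r, rules[m]))
            clusters[best].append(n)
        new_medoids = [
            max(members, key=lambda c: sum(
                rule_similarity(rules[c], rules[o]) for o in members))
            for members in clusters.values() if members
        ]
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    return [members for members in clusters.values() if members]

# Toy rules, invented for illustration only.
rules = [
    (frozenset({"a"}), frozenset({"x"}), 0.90),
    (frozenset({"a", "b"}), frozenset({"x"}), 0.80),  # dominated by the first rule
    (frozenset({"a", "c"}), frozenset({"x"}), 0.95),  # kept: higher confidence
    (frozenset({"p"}), frozenset({"y"}), 0.70),
    (frozenset({"p", "q"}), frozenset({"y"}), 0.75),
]
kept = remove_redundant(rules)
clusters = cluster_rules(kept, k=2)  # lists of indices into `kept`
```

On the toy input, the rules concluding `x` and the rules concluding `y` end up in separate clusters. Any other set-based similarity could be substituted in `rule_similarity` without changing the rest of the loop.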

 
