A Divide-and-Conquer Apriori Algorithm Directly Generating Frequent Itemsets  Cited by: 14


Author: Zheng Lin (郑麟) [1]

Affiliation: [1] School of Computer Science, Wuhan University, Wuhan, Hubei 430000

Source: Computer Applications and Software (《计算机应用与软件》), 2014, No. 4, pp. 297-301, 326 (6 pages)

Abstract: To address the shortcomings of the Apriori algorithm, this paper proposes MPIN_Apriori, an improved algorithm based on an item-count Boolean matrix. Applying a divide-and-conquer strategy, the algorithm splits the dataset into segments and processes each one separately: it compresses the Boolean matrix using the number of items in each transaction, then uses vector intersection operations and Apriori pruning to generate the local frequent k-itemsets directly, and finally merges these into the global frequent k-itemsets. This fundamentally changes Apriori's repeatedly iterating workflow, avoids the join operation, and greatly reduces the memory burden. Experimental results show that, when mining frequent itemsets from large databases, the algorithm is markedly more efficient than Apriori, and the approach is also a useful reference for distributed data mining.
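The steps named in the abstract (segmenting the data, compressing by transaction item count, intersecting Boolean vectors, Apriori pruning, merging local results) can be illustrated with a minimal Python sketch. This is not the paper's MPIN_Apriori implementation, whose details the abstract does not give. The function names `local_frequent` and `global_frequent`, the use of integer bitmasks as Boolean column vectors, and the merge step (the classical partition approach: take the union of local results as candidates, then recount them over the full data) are all assumptions made for illustration.

```python
from math import ceil


def popcount(mask):
    """Number of set bits, i.e. the support counted from a Boolean vector."""
    return bin(mask).count("1")


def local_frequent(segment, k, min_ratio):
    """Frequent k-itemsets of one segment (hypothetical helper, not from the paper)."""
    min_count = ceil(min_ratio * len(segment))
    # Compression by item count: a transaction with fewer than k items
    # cannot contain any k-itemset, so its row is dropped from the matrix.
    rows = [t for t in segment if len(t) >= k]
    # One Boolean column per item, stored as an integer bitmask over the rows.
    vectors = {}
    for r, t in enumerate(rows):
        for item in t:
            vectors[item] = vectors.get(item, 0) | (1 << r)
    # Apriori pruning at level 1: infrequent items are discarded.
    items = sorted(i for i, v in vectors.items() if popcount(v) >= min_count)

    result = {}

    def extend(start, chosen, mask):
        if len(chosen) == k:
            result[frozenset(chosen)] = popcount(mask)
            return
        for idx in range(start, len(items)):
            # Vector intersection: rows that contain every chosen item.
            m = mask & vectors[items[idx]]
            # Apriori pruning: no superset of an infrequent set is frequent.
            if popcount(m) >= min_count:
                extend(idx + 1, chosen + [items[idx]], m)

    extend(0, [], (1 << len(rows)) - 1)
    return result


def global_frequent(transactions, k, min_ratio, n_segments):
    """Merge local results into global frequent k-itemsets (assumed merge rule)."""
    size = max(1, ceil(len(transactions) / n_segments))
    segments = [transactions[i:i + size] for i in range(0, len(transactions), size)]
    # A globally frequent itemset is locally frequent in at least one segment,
    # so the union of the local results is a complete candidate set.
    candidates = set()
    for seg in segments:
        candidates |= set(local_frequent(seg, k, min_ratio))
    min_count = ceil(min_ratio * len(transactions))
    counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
    return {c: n for c, n in counts.items() if n >= min_count}


data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"},
        {"b", "c"}, {"a", "b", "c"}, {"a"}]
pairs = global_frequent(data, k=2, min_ratio=0.5, n_segments=2)
```

In this sketch each segment only ever holds its own compressed matrix in memory, and the k-itemsets are built by intersecting vectors rather than by joining (k-1)-itemsets, which is the general idea the abstract describes. The final recount over the full data is a design choice of the sketch, not something the abstract states.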

Keywords: Apriori algorithm; frequent itemsets; item-count Boolean matrix; divide and conquer

Classification number: TP301.6 [Automation and Computer Technology: Computer System Architecture]

 
