Author: Zheng Lin (郑麟) [1]
Source: Computer Applications and Software (《计算机应用与软件》), 2014, No. 4, pp. 297-301, 326 (6 pages)
Abstract: To address the shortcomings of the Apriori algorithm, this paper proposes MPIN_Apriori, an improved algorithm based on an item-count Boolean matrix. The improved algorithm applies a divide-and-conquer strategy to process the dataset in segments, uses the number of items in each transaction to compress the matrix, and uses vector intersection together with Apriori pruning to generate local frequent k-itemsets directly; these are finally merged into global frequent k-itemsets. The algorithm fundamentally changes the repeatedly iterating workflow of Apriori, avoids the join operation, and greatly reduces memory load. Experimental results show that, when mining frequent itemsets in large databases, its efficiency is significantly higher than that of Apriori, and that it is also a useful reference for distributed data mining.
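Keywords: Apriori algorithm; frequent itemsets; item-count Boolean matrix; divide and conquer
Classification number: TP301.6 [Automation and Computer Technology: Computer System Architecture]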
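The abstract names the ingredients of the method but not its details, so the following is only a minimal Python sketch of the general idea, not the paper's MPIN_Apriori implementation. It assumes that each item is stored as a Boolean column vector (here a Python integer used as a bitmask), that "compression by item count" means dropping transactions with fewer than k items, that local thresholds are proportional to segment size, and that the merge step recounts the union of local candidates over the whole dataset. All function names and parameters are hypothetical.

```python
from itertools import combinations


def build_boolean_matrix(transactions, items):
    """Return one Boolean column per item, encoded as an int bitmask over rows."""
    vectors = {item: 0 for item in items}
    for row, transaction in enumerate(transactions):
        for item in transaction:
            vectors[item] |= 1 << row
    return vectors


def support_count(vectors, itemset, n_rows):
    """Count rows containing every item: AND the columns, then count set bits."""
    mask = (1 << n_rows) - 1
    for item in itemset:
        mask &= vectors[item]
    return bin(mask).count("1")


def local_frequent_k_itemsets(transactions, k, min_ratio):
    """Frequent k-itemsets of one segment (assumed reading of the abstract)."""
    # Assumed compression rule: a row with fewer than k items
    # cannot support any k-itemset, so it is dropped.
    rows = [t for t in transactions if len(t) >= k]
    min_count = min_ratio * len(transactions)
    items = sorted({item for t in rows for item in t})
    vectors = build_boolean_matrix(rows, items)
    # Apriori-style pruning: every item of a frequent k-itemset must itself
    # be frequent among the retained rows.
    frequent_items = [
        item for item in items
        if support_count(vectors, [item], len(rows)) >= min_count
    ]
    # Generate k-itemsets directly, without the level-by-level join step.
    return [
        candidate for candidate in combinations(frequent_items, k)
        if support_count(vectors, candidate, len(rows)) >= min_count
    ]


def frequent_k_itemsets(transactions, k, min_ratio, n_segments=2):
    """Divide into segments, mine each locally, then merge and verify globally."""
    if not transactions:
        return {}
    size = -(-len(transactions) // n_segments)  # ceiling division
    segments = [transactions[i:i + size] for i in range(0, len(transactions), size)]
    candidates = set()
    for segment in segments:
        candidates.update(local_frequent_k_itemsets(segment, k, min_ratio))
    # Merge step (assumption): recount every local candidate on the full data.
    rows = [t for t in transactions if len(t) >= k]
    items = sorted({item for t in rows for item in t})
    vectors = build_boolean_matrix(rows, items)
    min_count = min_ratio * len(transactions)
    result = {}
    for candidate in sorted(candidates):
        count = support_count(vectors, candidate, len(rows))
        if count >= min_count:
            result[candidate] = count
    return result
```

A call such as `frequent_k_itemsets([{"a", "b", "c"}, {"a", "b"}, {"a", "c"}], k=2, min_ratio=0.5)` returns a dictionary mapping each frequent 2-itemset (as a sorted tuple) to its support count. The proportional local threshold follows the usual partition argument: an itemset that is frequent over the whole dataset must be frequent in at least one segment, so the union of local results misses no global answer.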