Research on Improved Apriori Algorithm Based on Compressed Matrix  Cited by: 46

Authors: Luo Dan[1], Li Taoshen[1]

Affiliation: [1] School of Computer, Electronics and Information, Guangxi University, Nanning 530004, China

Source: Computer Science (《计算机科学》), 2013, No. 12, pp. 75-80 (6 pages)

Funding: Supported by the National Natural Science Foundation of China (60973074)

Abstract: To address the shortcomings of existing matrix-based Apriori algorithms, an improved Apriori algorithm based on a compressed matrix, called NCM_Apriori__l, is proposed. The improvements are as follows: (1) two arrays are added to record the number of 1s in each row and each column of the matrix, which reduces the number of matrix scans needed during compression; (2) itemsets that cannot be joined, as well as infrequent itemsets, are deleted during compression, which makes the matrix smaller and improves space efficiency; (3) the condition for deleting transaction columns is changed to reduce errors in the mining result, and the termination condition is changed to reduce the number of loop iterations. Performance analysis and experimental results show that the improved algorithm mines frequent itemsets effectively and is more computationally efficient than existing Apriori algorithms based on compressed matrices.
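The idea behind improvement (1) can be illustrated with a small Python sketch. It is not the authors' NCM_Apriori__l: the abstract does not state the exact deletion or termination conditions, so the thresholds used here (drop an item row whose count of 1s is below `min_support`, drop a transaction column containing fewer than `k` ones) are assumptions, and the names `build_matrix`, `compress`, and `frequent_itemsets` are hypothetical. The point is only that keeping the row and column counts in two arrays lets compression update them incrementally instead of rescanning the whole matrix.

```python
from itertools import combinations


def build_matrix(transactions, items):
    """Rows are items, columns are transactions; 1 means the item occurs."""
    return {item: [1 if item in t else 0 for t in transactions] for item in items}


def compress(matrix, min_support, k):
    """Assumed compression rule (not taken from the paper):
    drop item rows with fewer than min_support ones and
    transaction columns with fewer than k ones."""
    rows = dict(matrix)
    n_cols = len(next(iter(rows.values()))) if rows else 0
    # The two count arrays: ones per row and ones per column.
    row_count = {item: sum(bits) for item, bits in rows.items()}
    col_count = [sum(bits[j] for bits in rows.values()) for j in range(n_cols)]
    live_cols = set(range(n_cols))

    changed = True
    while changed:
        changed = False
        # Deleting a row only touches the column counts it contributed to.
        for item in [i for i in rows if row_count[i] < min_support]:
            for j in live_cols:
                if rows[item][j]:
                    col_count[j] -= 1
            del rows[item], row_count[item]
            changed = True
        # Deleting a column only touches the row counts it contributed to.
        for j in [c for c in live_cols if col_count[c] < k]:
            for item in rows:
                if rows[item][j]:
                    row_count[item] -= 1
            live_cols.discard(j)
            changed = True

    keep = sorted(live_cols)
    return {item: [bits[j] for j in keep] for item, bits in rows.items()}


def frequent_itemsets(transactions, min_support):
    """Return {frozenset(itemset): support} using an absolute support count."""
    items = sorted({i for t in transactions for i in t})
    matrix = build_matrix(transactions, items)
    result = {}
    for item, bits in matrix.items():
        if sum(bits) >= min_support:
            result[frozenset([item])] = sum(bits)

    k = 2
    while True:
        matrix = compress(matrix, min_support, k)
        if len(matrix) < k:  # too few items left to form a k-itemset
            break
        found = False
        for combo in combinations(sorted(matrix), k):
            # Apriori property: every (k-1)-subset must already be frequent.
            if not all(frozenset(s) in result for s in combinations(combo, k - 1)):
                continue
            # Support = number of columns where all k rows hold a 1.
            support = sum(all(col) for col in zip(*(matrix[i] for i in combo)))
            if support >= min_support:
                result[frozenset(combo)] = support
                found = True
        if not found:
            break
        k += 1
    return result
```

Under these assumed rules the compression does not change the support of any remaining k-itemset, because a transaction containing all k items always has at least k ones in its column and is therefore never dropped.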

Keywords: data mining; frequent itemsets; Apriori algorithm; compressed matrix

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
