Improved frequent itemset mining algorithm based on interval list (cited by: 4)

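Affiliations: [1] College of Information Engineering, Capital Normal University, Beijing 100048; [2] School of Traffic and Transportation, Beijing Jiaotong University, Beijing 100044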

Source: Journal of Computer Applications, 2016, No. 4, pp. 997-1001 (5 pages)

Funding: National Natural Science Foundation of China (61272029)

Abstract: The PrePost algorithm needs to build a complex Pre-order and Post-order Code tree (PPC-tree) and Node lists (N-lists). To address this, an improved and efficient frequent itemset mining algorithm based on the Interval list (I-list) is proposed. First, the algorithm adopts the interval-encoded FP-tree (IFP-tree), a data storage structure more compact than the Frequent Pattern tree (FP-tree), so frequent itemsets are mined without iteratively building conditional FP-trees. Second, the more concise I-list replaces the complex N-list used in PrePost, which speeds up both tree construction and mining. Finally, for a single-branch path, some frequent itemsets are obtained directly by combination, which improves running time. Experimental results show that, for the same dataset and the same minimum support count, the improved algorithm produces the same mining results, which verifies its correctness. Its overall performance in both time and space is about 10% better than that of PrePost, and it works well on both sparse and dense databases.
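The single-branch shortcut mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard FP-tree property that, on a single path, every combination of nodes is an itemset whose support is the smallest count among the chosen nodes. The function name `single_path_itemsets` and the `(item, count)` path representation are hypothetical and are not taken from the paper, whose IFP-tree and I-list details are not given here.

```python
from itertools import combinations


def single_path_itemsets(path, min_support):
    """Enumerate frequent itemsets from a single-branch path.

    `path` is a list of (item, count) pairs ordered from root to leaf
    (an assumed representation, not the paper's data structure).
    Returns a dict mapping each itemset (a tuple of items) to its support.
    """
    # Nodes below the threshold cannot take part in any frequent itemset.
    frequent = [(item, count) for item, count in path if count >= min_support]
    result = {}
    for size in range(1, len(frequent) + 1):
        for combo in combinations(frequent, size):
            items = tuple(item for item, _ in combo)
            # Support of a combination is the minimum count among its nodes.
            result[items] = min(count for _, count in combo)
    return result


# Usage: a path a:5 -> b:4 -> c:2 with a minimum support count of 3.
itemsets = single_path_itemsets([("a", 5), ("b", 4), ("c", 2)], min_support=3)
print(itemsets)  # {('a',): 5, ('b',): 4, ('a', 'b'): 4}
```

Enumerating combinations this way avoids any recursive tree construction for the path, which is the kind of saving the abstract attributes to the combination step.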

Keywords: data mining; association rule; frequent itemset; frequent pattern tree; interval list

Classification number: TP311.13 [Automation and Computer Technology: Computer Software and Theory]
