An efficient frequent itemset mining algorithm based on index array

Cited by: 1

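Affiliations: [1] School of Information Engineering, University of Science and Technology Beijing, Beijing 100083 [2] College of Information, Shanghai Fisheries University, Shanghai 200090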

Source: Chinese High Technology Letters, 2008, No. 3, pp. 259-264 (6 pages)

Funding: Supported by the National Natural Science Foundation of China (60675030, 60463003) and the China Postdoctoral Science Foundation (20060390399)

Abstract: To improve the performance of frequent itemset mining algorithms based on the vertical representation of a database, this paper presents the idea of using an index array to speed up computation. The concept of the index array and a method for computing it are proposed, followed by a new efficient frequent itemset mining algorithm, Index-FIMiner. The algorithm greatly reduces unnecessary tidset intersections and the corresponding frequency checks. It is also proved that a representative item can be joined directly with every combination of the items in its subsume index, and that each resulting itemset has the same support as the representative item. This lowers the cost of processing these frequent itemsets and improves the algorithm's performance. Experimental results show that Index-FIMiner achieves high mining efficiency.

Keywords: data mining; association rules; frequent itemsets; index array; subsume index

Classification: TP311.13 [Automation and Computer Technology: Computer Software and Theory]
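
The abstract does not spell out how the index array is built, so the following is only a minimal sketch under stated assumptions, not the authors' Index-FIMiner implementation. It assumes that frequent items are ordered by ascending support and that the subsume index of an item holds the later items whose tidset contains that item's tidset. Under that assumption, joining the item with any combination of its subsume index needs no tidset intersection, because the support is unchanged. The function names (`build_tidsets`, `build_index_array`, `expand_representative`) are hypothetical.

```python
from itertools import combinations


def build_tidsets(transactions):
    """Vertical representation: map each item to the set of transaction ids containing it."""
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(item, set()).add(tid)
    return tidsets


def build_index_array(tidsets, min_support):
    """Return [(item, subsume_index), ...] for the frequent items.

    Assumption: items are sorted by ascending support (ties broken by name),
    and subsume_index lists the later items whose tidset is a superset of
    the item's tidset.
    """
    frequent = sorted(
        (item for item, tids in tidsets.items() if len(tids) >= min_support),
        key=lambda item: (len(tidsets[item]), item),
    )
    index_array = []
    for pos, item in enumerate(frequent):
        subsume = [
            other for other in frequent[pos + 1:]
            if tidsets[item] <= tidsets[other]
        ]
        index_array.append((item, subsume))
    return index_array


def expand_representative(item, subsume, tidsets):
    """Join the representative item with every non-empty combination of its
    subsume index. No intersection is computed: each resulting itemset takes
    the representative item's support.
    """
    support = len(tidsets[item])
    itemsets = {}
    for size in range(1, len(subsume) + 1):
        for combo in combinations(subsume, size):
            itemsets[frozenset((item,) + combo)] = support
    return itemsets
```

A full miner would still have to handle itemsets that are not covered by this shortcut, for example by ordinary tidset intersection; that part is omitted here because the abstract gives no details.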
