Authors: Zhai Yue (翟悦)[1], Wang Can (王璨)[1], Sun Jianyan (孙建言)
Affiliation: [1] Department of Information Science, Dalian Institute of Science and Technology, Dalian 116052, Liaoning, China
Source: Computer Applications and Software (《计算机应用与软件》), 2018, No. 9, pp. 67-72 (6 pages)
Abstract: Mining frequent itemsets from massive data is time-consuming, and the recently proposed N-List structure can substantially improve mining efficiency. This paper presents HNSFI (Hash table and subsume frequent itemsets mining based on N-List), a new frequent itemset mining algorithm built on N-List. The algorithm uses a PPC-tree to generate N-Lists and introduces a hash table to store the itemsets represented by N-Lists, which speeds up the N-List intersection operation. It also introduces the concept of the subsume index and uses its properties to generate some frequent itemsets directly by combination, which further improves running time. The algorithm was tested and analyzed on three different datasets. The experimental results show that it achieves the best running time on dense datasets.
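The combination step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the definition of the subsume index commonly used in the N-List literature: item j is in the subsume index of item i when every transaction containing i also contains j, so i joined with any non-empty subset of its subsume index has the same support as i. The paper's exact definition and ordering rules may differ. The sketch uses plain transaction-id sets rather than a PPC-tree or N-Lists, and the function names `subsume_index` and `itemsets_from_subsume` are hypothetical.

```python
from itertools import combinations


def subsume_index(transactions, min_support):
    """Return (index, support) for the frequent single items.

    index[i] lists every other frequent item j whose transaction-id set
    contains that of i, i.e. j occurs in every transaction where i occurs.
    Brute-force tidset comparison; not the N-List based computation.
    """
    tidsets = {}
    for tid, transaction in enumerate(transactions):
        for item in set(transaction):
            tidsets.setdefault(item, set()).add(tid)

    frequent = {i: t for i, t in tidsets.items() if len(t) >= min_support}
    index = {
        i: sorted(j for j, tj in frequent.items() if j != i and ti <= tj)
        for i, ti in frequent.items()
    }
    support = {i: len(t) for i, t in frequent.items()}
    return index, support


def itemsets_from_subsume(item, subsumed, support):
    """Combine `item` with every non-empty subset of its subsume index.

    Each resulting itemset inherits the support of `item`, so no
    intersection or counting step is needed for it.
    """
    result = {}
    for size in range(1, len(subsumed) + 1):
        for combo in combinations(subsumed, size):
            result[frozenset((item, *combo))] = support
    return result


# Hypothetical toy data: "a" and "b" occur wherever "c" occurs.
transactions = [["a", "b", "c"], ["a", "b", "c"], ["a", "b"]]
index, support = subsume_index(transactions, min_support=2)
generated = itemsets_from_subsume("c", index["c"], support["c"])
```

Because two items with identical transaction sets subsume each other, the same itemset can be produced from more than one starting item. Keying the results by `frozenset` collapses such duplicates in this sketch, whereas a real implementation would typically impose an item ordering instead.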
Classification: TP301.06 [Automation and Computer Technology: Computer Systems Architecture]