Research of Frequent Items Mining Approximate Algorithm Based on Progressive Sampling

Cited by: 2

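Affiliations: [1] School of Computer and Communication Engineering, Huaian Vocational College of Information Technology, Huai'an, Jiangsu 223003; [2] School of Information and Electronic Engineering, Henan University of Animal Husbandry and Economy, Zhengzhou 450044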

Source: Control Engineering of China (《控制工程》), 2017, No. 9, pp. 1786-1791 (6 pages)

Abstract: To improve the performance of frequent itemset mining, this paper proposes FIMAA-PS (Frequent Itemsets Mining Approximate Algorithm based on Progressive Sampling). The algorithm extracts samples from the dataset by progressive sampling and sets the sample size for the next mining iteration automatically from the output on the current sample. It uses the Rademacher average to derive a theoretical upper bound on the frequency deviation of the output, and this bound gives the stopping condition. The stopping condition is checked with a single fast scan of the sample, after which the mining result is output. Experimental results show that, compared with traditional exact mining algorithms and with approximate mining algorithms based on static sampling, FIMAA-PS has a significant advantage in output accuracy and running time.
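The loop described in the abstract (sample, mine, bound the deviation, then either stop or enlarge the sample) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's FIMAA-PS. The function names, the restriction to itemsets of at most two items, the rule for growing the sample, and the deviation bound are all hypothetical choices made here. In particular, the bound combines Massart's lemma with a textbook Rademacher generalization bound as a simplified stand-in, and it does not correct for checking the condition repeatedly.

```python
import math
import random
from collections import Counter
from itertools import combinations


def itemset_frequencies(sample, max_size=2):
    """Relative frequency in `sample` of every itemset with at most `max_size` items."""
    counts = Counter()
    for transaction in sample:
        items = sorted(set(transaction))
        for k in range(1, max_size + 1):
            counts.update(combinations(items, k))
    return {itemset: c / len(sample) for itemset, c in counts.items()}


def deviation_bound(freqs, n, delta):
    """Simplified stand-in for the frequency-deviation bound (an assumption).

    Massart's lemma bounds the empirical Rademacher average using the largest
    sample frequency and the number of observed itemsets; a standard
    generalization bound then adds a confidence term that depends on delta.
    """
    max_freq = max(freqs.values(), default=0.0)
    rademacher = math.sqrt(2.0 * max_freq * math.log(len(freqs) + 1) / n)
    tail = 3.0 * math.sqrt(math.log(4.0 / delta) / (2.0 * n))
    return 2.0 * rademacher + tail


def progressive_mine(dataset, min_freq, epsilon, delta=0.1,
                     initial_size=200, max_sample_size=100_000, seed=0):
    """Grow a random sample until the deviation bound drops to epsilon."""
    if not dataset:
        raise ValueError("dataset must not be empty")
    rng = random.Random(seed)
    sample = []
    n = initial_size
    while True:
        # Extend the existing sample (drawing with replacement) up to size n.
        sample.extend(rng.choice(dataset) for _ in range(n - len(sample)))
        freqs = itemset_frequencies(sample)
        bound = deviation_bound(freqs, n, delta)
        if bound <= epsilon or n >= max_sample_size:
            break
        # The bound shrinks roughly like 1/sqrt(n), so the current output
        # suggests the next size: scale n by (bound / epsilon)^2.
        n = min(max_sample_size, max(n + 1, math.ceil(n * (bound / epsilon) ** 2)))
    # Lowering the threshold by epsilon avoids missing truly frequent itemsets
    # whenever the deviation really is at most epsilon.
    frequent = {s: f for s, f in freqs.items() if f >= min_freq - epsilon}
    return frequent, bound, n
```

Calling `progressive_mine(transactions, min_freq=0.4, epsilon=0.1)` on a list of transactions returns the approximately frequent itemsets with their sample frequencies, the final value of the bound, and the final sample size. If the size cap is reached first, the returned bound may still exceed `epsilon`, so callers should check it.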

Keywords: frequent item mining; approximate algorithm; progressive sampling; Rademacher average

Classification: TP3 [Automation and Computer Technology: Computer Science and Technology]
