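Affiliations: [1] School of Computer and Communication Engineering, Huaian Vocational College of Information Technology, Huai'an, Jiangsu 223003; [2] School of Information and Electronic Engineering, Henan University of Animal Husbandry and Economy, Zhengzhou 450044
Source: Control Engineering of China (《控制工程》), 2017, No. 9, pp. 1786-1791 (6 pages)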
Abstract: To improve the performance of frequent itemset mining, this paper proposes a Frequent Itemsets Mining Approximate Algorithm based on Progressive Sampling (FIMAA-PS). The algorithm draws samples from the dataset by progressive sampling and uses the output obtained on the current sample to set the sample size for the next mining iteration automatically. It uses the Rademacher average to derive a theoretical upper bound on the frequency deviation of the output, and this bound yields the stopping condition. The stopping condition is checked with a single fast scan of the sample, after which the mining results are output. Experimental results show that, compared with traditional exact mining algorithms and with approximate mining algorithms based on static sampling, FIMAA-PS has a significant advantage in both output accuracy and running time.
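Keywords: frequent itemset mining; approximate algorithm; progressive sampling; Rademacher average
Classification: TP3 [Automation and Computer Technology: Computer Science and Technology]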
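The loop described in the abstract (sample, mine, bound the frequency deviation, then either stop or enlarge the sample) can be illustrated with a minimal Python sketch. This is not the FIMAA-PS implementation: the record does not give the Rademacher-average bound, so the sketch substitutes a simple Hoeffding-plus-union-bound heuristic as a stand-in, and every function name, parameter, and default (`mine_frequent`, `deviation_bound`, `progressive_mine`, `eps`, `delta`, `start`) is hypothetical.

```python
import math
import random
from itertools import combinations


def mine_frequent(transactions, min_freq, max_size=2):
    """Brute-force miner: frequency of every itemset up to max_size items."""
    counts = {}
    for txn in transactions:
        items = sorted(set(txn))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[combo] = counts.get(combo, 0) + 1
    n = len(transactions)
    return {s: c / n for s, c in counts.items() if c / n >= min_freq}


def deviation_bound(n, num_candidates, delta):
    """Stand-in bound (assumption): Hoeffding plus a union bound over the
    candidates found in the sample. The paper uses a Rademacher-average
    bound instead, which is not reproduced here."""
    return math.sqrt(math.log(2 * max(num_candidates, 1) / delta) / (2 * n))


def progressive_mine(dataset, min_freq, eps, delta=0.1, start=100, seed=0):
    """Grow the sample until the deviation bound drops to eps / 2."""
    rng = random.Random(seed)
    n = min(start, len(dataset))
    while True:
        if n >= len(dataset):
            # The sample would cover the whole dataset: mine it exactly.
            return mine_frequent(dataset, min_freq), len(dataset), 0.0
        sample = [rng.choice(dataset) for _ in range(n)]
        # Lower the threshold by eps / 2 so borderline itemsets are kept.
        result = mine_frequent(sample, min_freq - eps / 2)
        bound = deviation_bound(n, len(result), delta)
        if bound <= eps / 2:
            return result, n, bound
        # The bound shrinks like 1 / sqrt(n); size the next sample from it.
        n = min(len(dataset), max(n + 1, math.ceil(n * (2 * bound / eps) ** 2)))
```

A call such as `progressive_mine(dataset, min_freq=0.5, eps=0.1)` returns the approximate frequent itemsets with their sample frequencies, the final sample size, and the final bound. The next sample size is derived from the current bound, which mirrors the automatic sample-size configuration mentioned in the abstract only in spirit.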