Authors: LIU Wenjie (刘文杰); YANG Haijun (杨海军) (School of Information Engineering, Lanzhou University of Finance and Economics, Lanzhou 730020, China)
Source: Journal of Jilin University (Information Science Edition) (《吉林大学学报(信息科学版)》), 2023, No. 2, pp. 329-337 (9 pages)
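Funding: Natural Science Foundation of Gansu Province (18JR3RA216, 21JR1RA283); Open Fund of the Key Laboratory of E-commerce Technology and Application of Gansu Province (Lanzhou University of Finance and Economics) (2018GSDZSW63A14).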
Abstract: Existing closed frequent itemset mining algorithms rely on a narrow range of pruning strategies: most prune only 1-itemsets, and few target 2-itemsets or n-itemsets (n≥3). Because an effective pruning strategy can detect and discard a large number of unpromising itemsets early, improving the pruning strategy can substantially raise the efficiency of this class of algorithm. Building on the ESCS (Estimated Support Co-occurrence Structure), this paper proposes an ESCS pruning strategy for 2-itemsets and applies it to the classical closed frequent itemset mining algorithm DCI_Closed (Direct Count Intersect Closed), yielding the DCI_ESCS (Direct Count Intersect Estimated Support Co-occurrence Structure) algorithm, which is also used to verify the effect of the ESCS pruning strategy. Experiments on several public datasets, under different minimum support thresholds, compare the running time of the algorithm before and after the improvement. The results show that DCI_ESCS performs well on denser datasets with long transactions and long itemsets, where its time efficiency improves to a certain extent.
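The general idea of pruning at the 2-itemset level can be sketched as follows. This is a minimal illustration only, not the authors' DCI_ESCS implementation: the abstract does not describe how ESCS is laid out, so the sketch assumes a plain table of pairwise co-occurrence supports, and the names build_pair_supports and can_extend are hypothetical. It relies on the fact that the support of an itemset can never exceed the support of any pair it contains.

```python
from collections import Counter
from itertools import combinations


def build_pair_supports(transactions):
    """Count, for every unordered item pair, the transactions containing both.

    Assumption: this table stands in for a pairwise co-occurrence structure;
    the real ESCS layout is not given in the abstract.
    """
    pair_support = Counter()
    for t in transactions:
        for a, b in combinations(sorted(set(t)), 2):
            pair_support[(a, b)] += 1
    return pair_support


def can_extend(prefix, item, pair_support, min_sup):
    """Return False if some 2-itemset {p, item} with p in prefix is infrequent.

    support(prefix | {item}) <= support({p, item}) for every p in prefix,
    so one infrequent pair rules out the extension before any costly
    support computation is attempted.
    """
    for p in prefix:
        key = (p, item) if p < item else (item, p)
        if pair_support[key] < min_sup:
            return False
    return True


transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "d"},
]
pairs = build_pair_supports(transactions)
print(can_extend({"a"}, "b", pairs, min_sup=2))       # pair (a, b) occurs twice
print(can_extend({"a", "b"}, "d", pairs, min_sup=2))  # pair (a, d) never occurs
```

In a miner of this kind, a check like can_extend would run before the candidate's support is actually counted, so extensions ruled out by an infrequent pair never reach that step.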
Classification: TP301 [Automation and Computer Technology - Computer System Architecture]