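Authors: HONG Yan (洪炎) [1]; ZHANG Lei (张磊) [1]; YAN Jiaqi (严加琪) (College of Electrical and Information Engineering, Anhui University of Science and Technology, Huainan 232001, China)
Affiliation: [1] College of Electrical and Information Engineering, Anhui University of Science and Technology, Huainan, Anhui 232001, China
Source: Journal of Anqing Normal University (Natural Science Edition) (《安庆师范大学学报(自然科学版)》), 2021, No. 2, pp. 20-25 (6 pages)
Funding: Youth Science Fund Project of the National Natural Science Foundation of China (61501006); General Program of the Anhui Provincial Natural Science Foundation (1808085MF169); Natural Science Research Project of Anhui Universities (KJ2018A0086).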
Abstract: With the advent of the big data era, incremental association rule mining has become a hot topic in data mining. CAN-tree is an important algorithm in this field, but its ordering of items by frequency makes the tree too large and lowers the algorithm's efficiency. To address this problem, an incremental association mining algorithm based on AP-CAN is proposed. The algorithm uses the idea of AP clustering to divide the original data set into multiple clusters according to the support of the items, prunes the clusters that do not meet the minimum support, replaces the item header table with a hash header table, and sorts each transaction according to data volume. Experimental results show that the method can significantly reduce the size of the CAN-tree, shorten item lookup time, and improve mining efficiency, and that it outperforms the existing CAN-tree algorithm in both efficiency and stability.
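Keywords: association rules; data mining; AP clustering; CAN-tree algorithm
Classification: TP301 [Automation and Computer Technology: Computer System Architecture]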
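Illustrative sketch: the record gives only the abstract, so the following Python is not the authors' AP-CAN implementation. It is a minimal prefix-tree sketch of two ideas the abstract names: dropping items below a minimum support, and using a hash-based header table (here a plain `dict`) for item lookup. The AP clustering step is replaced by a simple support threshold, the fixed item order is assumed to be lexicographic, and the names `Node`, `PrefixTree`, and `build_tree` are hypothetical.

```python
from collections import Counter


class Node:
    """One tree node: an item, its count, and its children keyed by item."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


class PrefixTree:
    """Prefix tree over transactions, with a dict used as a hash header table."""

    def __init__(self, order):
        # Assumption: a fixed item order, so later inserts never reorder the tree.
        self.rank = {item: i for i, item in enumerate(order)}
        self.root = Node(None)
        self.header = {}  # item -> list of nodes holding that item

    def insert(self, transaction):
        # Keep only known items, deduplicate, and sort by the fixed order.
        items = sorted({i for i in transaction if i in self.rank},
                       key=self.rank.__getitem__)
        node = self.root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                self.header.setdefault(item, []).append(child)
            child.count += 1
            node = child

    def support(self, item):
        # Average O(1) header lookup, then sum over that item's nodes.
        return sum(n.count for n in self.header.get(item, []))

    def size(self):
        # Number of nodes, excluding the root.
        return sum(len(nodes) for nodes in self.header.values())


def build_tree(transactions, min_support):
    # Simplification: a plain support threshold stands in for cluster pruning.
    counts = Counter(i for t in transactions for i in set(t))
    kept = sorted(i for i, c in counts.items() if c >= min_support)
    tree = PrefixTree(kept)
    for t in transactions:
        tree.insert(t)
    return tree
```

New transactions can be added later with `tree.insert(...)` without rebuilding the tree. In this sketch, items pruned from the initial batch stay excluded from later inserts, which a real incremental miner would need to handle.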