Source: Computer Science and Application (《计算机科学与应用》), 2021, No. 6, pp. 1706-1716 (11 pages)
Abstract: Association rules describe how items co-occur with one another and are an important topic in data mining research; the key concepts are support, confidence, and lift. The Apriori algorithm is a central method for mining association rules. The traditional Apriori algorithm has several bottlenecks: it scans the database many times, which imposes a heavy I/O load, and it generates a large number of redundant candidate itemsets. This paper therefore improves the Apriori algorithm in three ways. The amount of data scanned is reduced by compressing the rows and columns of a Boolean matrix. An index table replaces the generation of candidate itemsets. All of the resulting frequent itemsets are stored in a trie (written "Tried tree" in the original) for lookup, which shortens the time needed to compute confidence. Experimental results show that the improved algorithm substantially improves on the time and space efficiency of the traditional Apriori algorithm.
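The abstract does not give the details of the compression step, so the following is only a minimal sketch of the general idea, under stated assumptions: transactions are encoded as a Boolean matrix, columns whose item count falls below a minimum count are dropped, and rows that can no longer contain a k-itemset are dropped. The function names (`build_boolean_matrix`, `compress`, `count`, `rule_metrics`) and the sample transactions are hypothetical and not taken from the paper; the index table and the trie are not shown. Rule metrics are computed on the uncompressed matrix, because row compression preserves counts only for itemsets of size k or larger.

```python
def build_boolean_matrix(transactions):
    """Return (items, matrix); matrix[r][c] is True when transaction r contains items[c]."""
    items = sorted({item for t in transactions for item in t})
    matrix = [[item in t for item in items] for t in transactions]
    return items, matrix


def compress(items, matrix, min_count, k):
    """Drop infrequent columns and rows too short to hold a k-itemset, until stable."""
    while True:
        keep = [c for c in range(len(items))
                if sum(row[c] for row in matrix) >= min_count]
        new_items = [items[c] for c in keep]
        new_matrix = [[row[c] for c in keep] for row in matrix]
        # A row with fewer than k remaining items cannot support any k-itemset.
        new_matrix = [row for row in new_matrix if sum(row) >= k]
        if len(new_items) == len(items) and len(new_matrix) == len(matrix):
            return new_items, new_matrix
        items, matrix = new_items, new_matrix


def count(items, matrix, itemset):
    """Number of rows that contain every item of itemset."""
    cols = [items.index(i) for i in itemset]
    return sum(all(row[c] for c in cols) for row in matrix)


def rule_metrics(items, matrix, antecedent, consequent):
    """Support, confidence, and lift of the rule antecedent -> consequent."""
    n = len(matrix)
    n_a = count(items, matrix, antecedent)
    n_b = count(items, matrix, consequent)
    n_ab = count(items, matrix, antecedent | consequent)
    support = n_ab / n
    confidence = n_ab / n_a
    lift = n_ab * n / (n_a * n_b)
    return support, confidence, lift


# Hypothetical sample data, for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "diaper", "beer", "eggs"},
    {"milk", "diaper", "beer", "cola"},
    {"bread", "milk", "diaper", "beer"},
    {"bread", "milk", "diaper", "cola"},
    {"eggs"},
]
items, matrix = build_boolean_matrix(transactions)
small_items, small_matrix = compress(items, matrix, min_count=3, k=2)
support, confidence, lift = rule_metrics(items, matrix, {"beer"}, {"diaper"})
```

In this sample, compression removes the two infrequent columns (`cola`, `eggs`) and the one row left empty, while the count of the pair `{"beer", "diaper"}` is the same on both matrices, which is the property that makes later scans cheaper without changing the result.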
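Keywords: association rules; improved Apriori algorithm; frequent itemsets; trie ("Tried tree"); index table
Classification: TP3 [Automation and Computer Technology: Computer Science and Technology]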