Analysis of Association Rules Based on Improved Apriori Algorithm  Cited by: 2


Authors: 汪敏 (Wang Min), 朱习军 (Zhu Xijun)[1]

Affiliation: [1] College of Information Science and Technology, Qingdao University of Science and Technology, Qingdao, Shandong

Source: 《计算机科学与应用》 (Computer Science and Application), 2021, Issue 6, pp. 1706-1716 (11 pages)

Abstract: Association rules describe how items relate to one another and are an important topic in data mining; the key concepts are support, confidence, and lift. The Apriori algorithm is a central method for mining association rules. The traditional Apriori algorithm has several bottlenecks: it scans the database many times, which imposes a heavy I/O load, and it generates a large number of redundant candidate itemsets. This paper therefore improves Apriori in three ways: row and column compression of a Boolean matrix reduces the amount of data scanned; an index table replaces the generation of candidate itemsets; and the final set of frequent itemsets is stored in a Trie so that lookups are faster, which shortens the time needed to compute confidence. Experimental results show that the improved algorithm greatly improves on the time and space efficiency of the traditional Apriori algorithm.

Keywords: association rules; improved Apriori algorithm; frequent itemsets; Trie; index table

Classification code: TP3 [Automation and Computer Technology: Computer Science and Technology]
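
The abstract names support, confidence, and lift, and mentions row and column compression of a Boolean matrix, but it does not give the exact rules. The following is a minimal Python sketch under stated assumptions, not the authors' implementation: transactions are rows of a 0/1 matrix, items are columns, and compression is assumed to mean dropping item columns whose count falls below a minimum and then dropping rows that hold fewer than k remaining items. The function names, the item labels, and the sample data are all hypothetical.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[int]]  # one row per transaction, one 0/1 column per item


def count(matrix: Matrix, cols: Sequence[int]) -> int:
    """Number of transactions that contain every item in `cols`."""
    return sum(1 for row in matrix if all(row[c] for c in cols))


def support(matrix: Matrix, cols: Sequence[int]) -> float:
    """Fraction of all transactions that contain every item in `cols`."""
    return count(matrix, cols) / len(matrix)


def confidence(matrix: Matrix, a: Sequence[int], b: Sequence[int]) -> float:
    """Estimated P(b | a) for the rule a -> b."""
    return count(matrix, list(a) + list(b)) / count(matrix, a)


def lift(matrix: Matrix, a: Sequence[int], b: Sequence[int]) -> float:
    """support(a and b) / (support(a) * support(b)), computed from counts."""
    n = len(matrix)
    return count(matrix, list(a) + list(b)) * n / (count(matrix, a) * count(matrix, b))


def compress(matrix: Matrix, items: Sequence[str], min_count: int,
             k: int) -> Tuple[Matrix, List[str]]:
    """Assumed compression rule (not taken from the paper):
    1. drop columns whose item occurs in fewer than `min_count` rows;
    2. drop rows with fewer than `k` remaining items, since such rows
       cannot contain any k-itemset."""
    keep = [c for c in range(len(items))
            if sum(row[c] for row in matrix) >= min_count]
    reduced = [[row[c] for c in keep] for row in matrix]
    reduced = [row for row in reduced if sum(row) >= k]
    return reduced, [items[c] for c in keep]


# Hypothetical sample data: 5 transactions over 4 items.
items = ["bread", "milk", "beer", "eggs"]
matrix = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 1],
    [1, 1, 0, 0],
]

rule_support = support(matrix, [0, 1])          # bread and milk together
rule_confidence = confidence(matrix, [0], [1])  # bread -> milk
rule_lift = lift(matrix, [0], [1])
small_matrix, small_items = compress(matrix, items, min_count=2, k=2)
```

After compression, support still has to be divided by the original number of transactions, because the dropped rows remain part of the database even though they cannot contribute to any k-itemset.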

 
