基于Spark的改进关联规则算法研究被引量：3

An improved algorithm of association rules based on the Spark

出　　处：《电子技术应用》2017年第6期126-129,共4页Application of Electronic Technique

基　　金：国家自然科学基金(41272374)

摘　　要：针对关联规则Apriori算法在信息爆炸时代面对海量数据时,其计算周期大、算法效率低等问题,将数据以特定的数据结构进行存储,降低数据遍历次数;在连接操作前进行剪枝操作,并且改变剪枝操作的判定条件;同时将改进算法IApriori与基于内存的大数据并行计算处理框架Apache Spark相结合,提出了一种基于Spark的Apriori改进算法(Spark+IAprior)。实验结果表明,Spark+IApriori算法在集群伸缩性和加速比方面都优于Apriori算法。Association rules Apriori algorithm have problems with large calculation cycle and low algorithm efficiency faced with huge amounts of data in the era of information explosion, data in a specific storage on the data structure to reduce the data on the number of times past, pruning operation before the items self-joins and changing the terms of judgment have been adopted in the paper, and the algorithm combined with Spark computing framework, an improved algorithm based on the Spark（Spark ＋IApriori） can be put forward. Experimental results show that the Spark＋IApriori algorithm has a good data scalability and speed ratio than Apriori.

关键词：关联规则 APRIORI MAPREDUCE HADOOP SPARK

分类号：TP391[自动化与计算机技术—计算机应用技术]

参考文献：

正在载入数据...

二级参考文献：

正在载入数据...

耦合文献：

正在载入数据...

引证文献：

正在载入数据...

二级引证文献：

正在载入数据...

同被引文献：

正在载入数据...

高级检索 检索式检索

时间限定

期刊范围

学科限定全选

高级检索 检索式检索

时间限定

期刊范围

学科限定全选

基于Spark的改进关联规则算法研究被引量：3

我的收藏

参考文献：

二级参考文献：

耦合文献：

引证文献：

二级引证文献：

同被引文献：

相关期刊文献：

相关的主题

相关的作者对象

相关的机构对象

下载全文

高级检索检索式检索

时间限定

期刊范围

学科限定全选

高级检索 检索式检索

时间限定

期刊范围

学科限定全选

基于Spark的改进关联规则算法研究 被引量：3

我的收藏

参考文献：

二级参考文献：

耦合文献：

引证文献：

二级引证文献：

同被引文献：

相关期刊文献：

相关的主题

相关的作者对象

相关的机构对象

下载全文

用户登录

高级检索检索式检索

基于Spark的改进关联规则算法研究被引量：3