Research on Distributed Parallel Association Rule Mining (分布式并行关联规则挖掘算法研究)

Cited by: 13

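Affiliations: [1] School of Information Technology, Jinling Institute of Technology, Nanjing 211169, Jiangsu, China; [2] Jiangsu Information Analysis Engineering Laboratory, Nanjing 211169, Jiangsu, China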

Source: Computer Applications and Software (《计算机应用与软件》), 2013, No. 10, pp. 113-115, 119 (4 pages)

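Funding: Jiangsu Province Modern Educational Technology Research Project (2011-R-19470); Natural Science Foundation of Jiangsu Higher Education Institutions (11KJD520006)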

Abstract: In association rule mining, the FP-Growth algorithm is roughly an order of magnitude faster than Apriori, but it has two major drawbacks: its frequent pattern tree (FP-tree) may grow too large to fit in memory, and the mining process runs serially. This paper proposes a distributed parallel association rule mining algorithm. The algorithm targets a distributed application data architecture and does not need to build a global FP-tree, which avoids the problem of a global FP-tree that is too large to hold in memory. Every principal step of the algorithm is processed in parallel. Test results and analysis show that, compared with the conventional FP-Growth algorithm, the proposed algorithm significantly improves execution efficiency and processing capacity through multi-node distributed parallel processing.
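The abstract does not give the algorithm's steps, so the following is only a minimal sketch of the general idea it describes: counting supports per node in parallel, then mining each frequent item's projected (conditional) database independently so that no global FP-tree is ever built. It is not the authors' method. The projection scheme is a generic pattern-growth technique swapped in for illustration, all function names (`local_counts`, `global_frequent_items`, `project`, `mine`, `mine_distributed`) are hypothetical, and a thread pool stands in for real cluster nodes.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def local_counts(partition):
    """Count, on one node, how many local transactions contain each item."""
    counts = Counter()
    for transaction in partition:
        counts.update(set(transaction))
    return counts


def global_frequent_items(partitions, min_support):
    """Merge per-node counts and keep items meeting the absolute min_support."""
    with ThreadPoolExecutor() as pool:  # stand-in for separate nodes
        total = sum(pool.map(local_counts, partitions), Counter())
    return {item: n for item, n in total.items() if n >= min_support}


def project(partitions, frequent):
    """Build one projected database per frequent item.

    Each transaction is reduced to its frequent items, ordered by descending
    global support. For every item, the items ranked before it form a prefix
    that is appended to that item's projected database.
    """
    ordering = sorted(frequent, key=lambda i: (-frequent[i], i))
    rank = {item: r for r, item in enumerate(ordering)}
    projected = {item: [] for item in frequent}
    for partition in partitions:
        for transaction in partition:
            ordered = sorted((i for i in set(transaction) if i in rank),
                             key=rank.get)
            for pos, item in enumerate(ordered):
                projected[item].append(ordered[:pos])
    return projected


def mine(suffix, database, min_support):
    """Recursive pattern growth over one projected database (no tree)."""
    results = {}
    counts = Counter(i for prefix in database for i in prefix)
    for item, n in counts.items():
        if n < min_support:
            continue
        pattern = suffix | {item}
        results[pattern] = n
        # Conditional database: what precedes `item` in prefixes containing it.
        conditional = [p[:p.index(item)] for p in database if item in p]
        results.update(mine(pattern, conditional, min_support))
    return results


def mine_distributed(partitions, min_support):
    """Return {frozenset(itemset): support} without a global FP-tree."""
    frequent = global_frequent_items(partitions, min_support)
    projected = project(partitions, frequent)
    patterns = {frozenset([i]): n for i, n in frequent.items()}

    def task(item):
        # Each projected database can be mined independently of the others.
        return mine(frozenset([item]), projected[item], min_support)

    with ThreadPoolExecutor() as pool:  # stand-in for separate nodes
        for found in pool.map(task, projected):
            patterns.update(found)
    return patterns
```

Calling `mine_distributed(partitions, min_support)` with `partitions` as a list of per-node transaction lists returns a dictionary mapping each frequent itemset to its support count. Because every projected database is self-contained, the memory needed at any one worker is bounded by that item's projection rather than by the whole dataset, which is the property the abstract attributes to avoiding a global FP-tree. In CPython, threads do not give CPU parallelism, so a real deployment would replace the thread pool with processes or separate machines.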

Keywords: data mining; association rules; frequent patterns; parallel algorithm

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]
