Research on Parallelisation of DHP Algorithm Based on MapReduce

Cited by: 1


Authors: Zhou Guojun[1], Wu Qingjun[1]

Affiliation: [1] School of Mathematics and Information Science, Yulin Normal University, Yulin 537000, Guangxi, China

Source: Computer Applications and Software (《计算机应用与软件》), 2016, Issue 6, pp. 47-50, 91 (5 pages)

Funding: Scientific and Technological Research Project of Guangxi Higher Education Institutions (LX2014300)

Abstract: The DHP (direct hashing and pruning) algorithm suffers from long execution times and low efficiency when mining association rules from big data. To address this, we studied parallelisation strategies for DHP. Based on the MapReduce parallel programming model of the Hadoop cloud computing platform, we designed a parallel DHP algorithm and present its overall flow together with algorithm descriptions of the Map and Reduce functions. Compared with the original DHP algorithm, the parallel version exploits the computing capacity of a Hadoop cluster and improves the efficiency of mining association rules from large datasets. We analyse the execution of the parallel algorithm through a worked example and report experiments on multiple datasets. The experimental results show that the parallel DHP algorithm achieves good speedup and scalability on big data.
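Illustrative sketch (not taken from the paper): this record does not reproduce the authors' Map and Reduce functions, so the Python below only shows how the first pass of the standard DHP technique can be phrased in a map/reduce style. It simulates the shuffle in memory rather than calling Hadoop. All names (`map_pass1`, `reduce_sum`, `run_mapreduce`, `candidate_2_itemsets`), the bucket count of 7, and the hash function are assumptions chosen for illustration.

```python
from collections import defaultdict
from itertools import combinations

NUM_BUCKETS = 7  # assumed hash-table size, chosen only for this example


def bucket_of(pair):
    """Hash a 2-itemset of integer items into a bucket (illustrative hash)."""
    a, b = pair
    return (a * 10 + b) % NUM_BUCKETS


def map_pass1(transaction):
    """Map: emit a count for every item and for the bucket of every 2-itemset."""
    items = sorted(set(transaction))
    for item in items:
        yield ("item", item), 1
    for pair in combinations(items, 2):
        yield ("bucket", bucket_of(pair)), 1


def reduce_sum(key, values):
    """Reduce: sum the partial counts collected for one key."""
    return key, sum(values)


def run_mapreduce(records, mapper, reducer):
    """In-memory stand-in for the shuffle phase; not a Hadoop API."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in groups.items())


def candidate_2_itemsets(transactions, min_support):
    """Keep pairs of frequent items whose hash bucket also meets min_support."""
    counts = run_mapreduce(transactions, map_pass1, reduce_sum)
    frequent_items = sorted(
        key[1] for key, count in counts.items()
        if key[0] == "item" and count >= min_support
    )
    return [
        pair for pair in combinations(frequent_items, 2)
        if counts.get(("bucket", bucket_of(pair)), 0) >= min_support
    ]


if __name__ == "__main__":
    transactions = [[1, 3, 4], [2, 3, 5], [1, 2, 3, 5], [2, 5]]
    print(candidate_2_itemsets(transactions, min_support=2))
```

The bucket filter is what distinguishes DHP from plain Apriori: a pair of frequent items is discarded early when its bucket count is below the support threshold, so fewer candidates need exact counting in the next pass. On a real cluster the per-key sums would be computed by separate reducers; how the paper partitions this work is not stated in this record.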

Keywords: MapReduce; Hadoop; DHP algorithm; association rules

Classification: TP311.1 [Automation and Computer Technology: Computer Software and Theory]

 
