Parallel Mining of Large Dataset in Hadoop Distributed Computing Framework (Hadoop分布式架构下大数据集的并行挖掘)

Cited by: 21

Source: Computer Technology and Development (《计算机技术与发展》), 2014, No. 1, pp. 22-25, 30 (5 pages)

Funding: Guangxi Natural Science Foundation (2011GXNSFA018152); Innovation Project of Guangxi Graduate Education (YCSZ2012007)

Abstract: Based on the Hadoop distributed computing platform, this paper presents a parallel mining algorithm suited to large datasets. The algorithm vertically partitions both the unstructured original dataset and the intermediate result files, so that the complete set of frequent itemsets can still be obtained. Each vertical data block is assigned to a different Hadoop computing node. This reduces the amount of data stored on each node, which in turn reduces the number of intersection operations each node performs and improves parallel mining efficiency. Experimental results show that the algorithm addresses the heavy data communication, the large volume of intermediate data, and the large number of intersection operations that arise when mining large datasets, and that it is efficient and scalable.
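The abstract does not give the algorithm's details, so the following is only a minimal single-process sketch of the general idea it names: store the data vertically (item to set of transaction IDs), split the work into blocks, and count support by intersecting ID sets. The function names (`to_vertical`, `mine_block`, `parallel_mine`), the prefix-based block assignment, and the sample data are all assumptions made for illustration, not the authors' method. On a real Hadoop cluster each `mine_block` call would run as a separate task on its own node.

```python
from collections import defaultdict


def to_vertical(transactions):
    """Convert horizontal transactions to a vertical layout: item -> set of transaction IDs."""
    tidsets = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets[item].add(tid)
    return dict(tidsets)


def mine_block(prefix_item, block, min_support):
    """Simulated per-node work (hypothetical): mine every frequent itemset
    whose smallest item is prefix_item, using only the tidsets in block."""
    results = {}

    def extend(itemset, tids, candidates):
        results[itemset] = len(tids)
        for i, item in enumerate(candidates):
            new_tids = tids & block[item]  # the intersection operation
            if len(new_tids) >= min_support:
                extend(itemset + (item,), new_tids, candidates[i + 1:])

    later_items = sorted(i for i in block if i > prefix_item)
    extend((prefix_item,), block[prefix_item], later_items)
    return results


def parallel_mine(transactions, min_support):
    """Build the vertical layout, hand each block to a simulated node, merge the results."""
    vertical = to_vertical(transactions)
    frequent = {i: t for i, t in vertical.items() if len(t) >= min_support}
    merged = {}
    for item in sorted(frequent):
        # Each node receives only the tidsets it needs: the prefix item and later items.
        block = {i: t for i, t in frequent.items() if i >= item}
        merged.update(mine_block(item, block, min_support))
    return merged


if __name__ == "__main__":
    sample = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
    for itemset, support in sorted(parallel_mine(sample, 3).items()):
        print(itemset, support)
```

Because each block holds only the ID sets for its prefix item and the items after it, a node stores less data and performs fewer intersections than it would with the full dataset, which is the effect the abstract attributes to vertical partitioning.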

Keywords: data mining; large dataset; parallel algorithm; Hadoop

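Classification: TP311.133.2 [Automation and Computer Technology: Computer Software and Theory]; TP338.6 [Automation and Computer Technology: Computer Science and Technology]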
