An Interactive Processing Method of Analysis and Query for Big Data Based on Hadoop

Cited by: 5

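Affiliations: [1] Xi'an University of Posts and Telecommunications, Xi'an, Shaanxi 710061; [2] Shaanxi Information Engineering Research Institute, Xi'an, Shaanxi 710061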

Source: Computer Technology and Development (《计算机技术与发展》), 2016, No. 8, pp. 134-137, 142 (5 pages)

Funding: 2015 Shaanxi Province Informatization Technology Research Project (2015-002)

Abstract: The Hadoop-based interactive big data analysis and query processing method is designed to analyze and query large data sets quickly; its most important feature is query speed. The method can run on clusters of thousands of nodes, is suited to analyzing semi-structured and nested data, and is compatible with existing SQL environments and Apache Hive. This paper uses the method to connect to HDFS, Hive, and HBase for query tests, and also performs join queries across different data sources at the same time. In the same Hadoop cluster environment, the query speed of the method is compared with that of Spark SQL on data sets of 100,000, 200,000, 500,000, 1 million, and 5 million records. Repeated experiments show that the method is fast and efficient, and can help enterprise users query Hadoop data and perform enterprise-level big data analysis quickly and efficiently.

Keywords: Hadoop cluster; big data processing; interactive query; fast; SQL

Classification: TP302.1 [Automation and Computer Technology: Computer System Architecture]
