Job schedulers for Big data processing in Hadoop environment: testing real-life schedulers using benchmark programs  Cited by: 2

Authors: Mohd Usama, Mengchen Liu, Min Chen

Affiliation: [1] Embedded and Pervasive Computing (EPIC) Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China

Source: Digital Communications and Networks (English edition), 2017, Issue 4, pp. 260-273 (14 pages)

Abstract: Big data is now widely used because it has proved highly successful in fields such as social media and e-commerce transactions. The term describes the tools and technologies needed to capture, manage, store, distribute, and analyze datasets of petabyte scale or larger, with differing structures, at high speed. Big data can be structured, unstructured, or semi-structured. Hadoop is an open-source framework used to process large amounts of data inexpensively and efficiently, and job scheduling is a key factor in achieving high performance in big data processing. This paper gives an overview of big data and highlights its problems and challenges. It then describes the Hadoop Distributed File System (HDFS), Hadoop MapReduce, and various parameters that affect the performance of job scheduling algorithms in big data, such as the Job Tracker, Task Tracker, Name Node, and Data Node. The primary purpose of the paper is to present a comparative study of job scheduling algorithms, along with their experimental results, in a Hadoop environment. In addition, the paper describes the features, advantages, and drawbacks of various Hadoop job schedulers, such as FIFO, Fair, Capacity, Deadline Constraints, Delay, LATE, and Resource Aware, and provides a comparative study of these schedulers.

Keywords: Big Data; Hadoop; MapReduce; HDFS; Scheduler; Classification; Locality; Benchmark

Classification number: TN929.53 [Electronics and Telecommunications: Communication and Information Systems]

 
