Enhanced Best Fit Algorithm for Merging Small Files

Authors: Adnan Ali, Nada Masood Mirza, Mohamad Khairi Ishak

Affiliations: [1] School of Electrical and Electronic Engineering, Universiti Sains Malaysia (USM), Nibong Tebal, Pulau Pinang, 14300, Malaysia; [2] University College, United Arab Emirates University, Al Ain, UAE

Source: Computer Systems Science & Engineering (English-language journal), 2023, Issue 7, pp. 913-928 (16 pages)

Funding: This research was supported by the Universiti Sains Malaysia (USM) and the Ministry of Higher Education Malaysia through the Fundamental Research Grant Scheme (FRGS, Grant No. FRGS/1/2020/TK0/USM/02/1).

Abstract: In the Big Data era, numerous sources and environments generate massive amounts of data. Data at this scale requires specialized tools and procedures that can evaluate the information effectively and anticipate decisions about future changes. Hadoop is used to process this kind of data. However, it handles large volumes of data more efficiently than small ones, so small files make the framework inefficient. This study proposes a novel solution to the problem: the Enhanced Best Fit Merging (EBFM) algorithm, which merges files according to predefined parameters (type and size). The algorithm ensures that the size of each generated file falls in the same range as the maximum block size. Its primary goal is to merge files dynamically, by file type and according to the stated criteria, to guarantee the efficacy and efficiency of the established system. Merging takes place before the files are made available to the Hadoop framework. Additionally, the files generated by the system are named with specific keywords to ensure that no data is lost through file overwrites. The proposed approach guarantees the generation of the fewest possible large files, which reduces the input/output memory burden and aligns with the way the Hadoop framework works most effectively. The findings show that the proposed technique enhances the framework's performance by approximately 64% while comparing all other potential performance-impairing variables. The proposed approach can be implemented in any environment that uses the Hadoop framework, including but not limited to smart cities and real-time data analysis.

Keywords: Big data; Hadoop; MapReduce; small file; HDFS

Classification: TP31 [Automation and Computer Technology: Computer Software and Theory]
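
The abstract describes merging small files by type and size so that each merged file approaches the block size, and naming the outputs so that nothing is overwritten. The sketch below illustrates that general idea with a textbook best-fit-decreasing packing; it is not the authors' EBFM algorithm, whose details are not given in this record. The function names `best_fit_merge_plan` and `merged_names`, the use of the file extension as the "type", the output naming pattern, and the 128 MB block size are all assumptions made for illustration.

```python
from collections import defaultdict
from pathlib import PurePath

# Assumption: 128 MB, a common HDFS block size; the paper's value is not stated here.
BLOCK_SIZE = 128 * 1024 * 1024


def best_fit_merge_plan(files, block_size=BLOCK_SIZE):
    """Group (name, size_bytes) pairs into merge groups, one set per file type.

    Returns {extension: [[(name, size), ...], ...]} where each inner list is
    one planned merged file. Generic best-fit decreasing, not the paper's EBFM.
    """
    by_type = defaultdict(list)
    for name, size in files:
        by_type[PurePath(name).suffix.lower()].append((name, size))

    plan = {}
    for ext, items in by_type.items():
        bins = []  # each bin tracks remaining capacity and its member files
        # Place the largest files first, which usually packs more tightly.
        for name, size in sorted(items, key=lambda f: f[1], reverse=True):
            fitting = [b for b in bins if b["free"] >= size]
            if fitting:
                # Best fit: the bin that leaves the least space unused.
                target = min(fitting, key=lambda b: b["free"])
            else:
                # No bin can hold it (or the file exceeds a block): open a new bin.
                target = {"free": block_size, "files": []}
                bins.append(target)
            target["files"].append((name, size))
            target["free"] -= size
        plan[ext] = [b["files"] for b in bins]
    return plan


def merged_names(plan):
    """Assign a unique, hypothetical output name to every merge group."""
    named = []
    for ext, groups in plan.items():
        label = ext.lstrip(".") or "noext"
        for i, group in enumerate(groups):
            named.append((f"merged_{label}_{i:04d}{ext}", group))
    return named
```

A plan could then be built with `best_fit_merge_plan([("a.log", 60), ("b.log", 50)], block_size=100)` and passed to `merged_names` to obtain distinct output names per type. Packing per type keeps each merged file homogeneous, and the type label plus running index in the name is one simple way to avoid collisions between outputs.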
