Efficient method for storing small files in HDFS  (Cited by: 10)


Authors: Yin Ying (尹颖)[1], Lin Qing (林庆)[1,2], Lin Hanyang (林涵阳)

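Affiliations: [1] School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, Jiangsu 212013; [2] Department of Computer Science, Nanjing University of Science and Technology, Nanjing, Jiangsu 210094; [3] Jiangsu Shida Dimei Data Processing Co., Ltd. (江苏实达迪美数据处理有限公司), Kunshan, Jiangsu 215332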

Source: Computer Engineering and Design (《计算机工程与设计》), 2015, No. 2, pp. 406-409 (4 pages)

Abstract: The Hadoop Distributed File System (HDFS) is designed to store large files at the GB and TB scale and cannot efficiently store large numbers of small files. To address this inefficiency, the NameNode's responsibilities are separated and a dedicated NFS server stores the metadata synchronously, which reduces the pressure of Client data requests, provides high-throughput data access, and improves access latency. A mapping scheme between files and data blocks is designed that allows multiple small files to be stored in the same block. The system is implemented, providing an effective solution for storing massive numbers of small files. Experimental results show that the mechanism achieves efficient storage of and access to massive numbers of small files while data volumes grow rapidly.
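The abstract does not spell out the file-to-block mapping, so the following is only a minimal Python sketch of the general idea of packing several small files into one shared block and locating each by an index entry. It is not the authors' implementation: the names `SmallFileStore`, `FileEntry`, and `Block`, the in-memory storage, and the 64 MiB default block size are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 64 * 1024 * 1024  # assumed block capacity in bytes (illustrative)


@dataclass
class Block:
    """One shared block that holds the bytes of several small files."""
    block_id: int
    data: bytearray = field(default_factory=bytearray)


@dataclass(frozen=True)
class FileEntry:
    """Hypothetical index record: where a small file lives inside a block."""
    block_id: int
    offset: int
    length: int


class SmallFileStore:
    """Toy in-memory store that appends small files to shared blocks."""

    def __init__(self, block_size: int = BLOCK_SIZE) -> None:
        self.block_size = block_size
        self.blocks = [Block(0)]
        self.index = {}  # file name -> FileEntry

    def put(self, name: str, payload: bytes) -> FileEntry:
        if len(payload) > self.block_size:
            raise ValueError("payload is larger than one block")
        current = self.blocks[-1]
        # Open a new block when the payload does not fit in the space left.
        if len(current.data) + len(payload) > self.block_size:
            current = Block(len(self.blocks))
            self.blocks.append(current)
        entry = FileEntry(current.block_id, len(current.data), len(payload))
        current.data.extend(payload)
        self.index[name] = entry
        return entry

    def get(self, name: str) -> bytes:
        # One index lookup, then a slice of the shared block.
        entry = self.index[name]
        block = self.blocks[entry.block_id]
        return bytes(block.data[entry.offset:entry.offset + entry.length])
```

With a deliberately tiny block size, `SmallFileStore(block_size=16)` places a 5-byte and a 6-byte file in block 0 and moves a third 6-byte file to block 1, so two files share a single block while each stays individually addressable through its index entry.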

Keywords: Hadoop Distributed File System; massive small files; performance optimization; separation of responsibilities; small-file merging

Classification number: TP338.8 [Automation and Computer Technology: Computer System Architecture]

 
