Distributed File System Load Balancing in Cloud Environment (cited by: 13)

Authors: WU Yaoyao; YANG Geng [1,2]

Affiliations: [1] College of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China; [2] Jiangsu Key Laboratory of Big Data Security & Intelligent Processing, Nanjing 210023, China

Source: Computer Engineering and Applications (《计算机工程与应用》), 2019, No. 10, pp. 67-72, 224 (7 pages in total)

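Funding: National Natural Science Foundation of China (No. 61572263; No. 61502251; No. 61502243); Natural Science Research Project of Jiangsu Higher Education Institutions (No. 14KJB520031); China Postdoctoral Science Foundation (No. 2016M601859); Natural Science Foundation of Jiangsu Province, General Program (No. BK20161516)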

Abstract: The Hadoop Distributed File System (HDFS) is a low-cost, highly fault-tolerant distributed file system designed to run on commodity hardware. It provides high-throughput data access and is well suited to applications on large datasets. However, HDFS still faces performance optimization problems, such as insufficient load balancing. The load balancer that ships with Hadoop can rebalance the cluster, but it requires the user to supply a static threshold in advance. To address the fixed and subjective nature of this threshold, this paper analyzes, evaluates, and optimizes parameters including disk space utilization, CPU utilization, memory utilization, disk I/O occupancy, and network bandwidth occupancy, and derives an expression for calculating the threshold. The threshold calculation and the resulting load balancing are then verified through theoretical analysis and simulation experiments. The experimental results show that, compared with Hadoop's algorithm based on a static input threshold, the method achieves a better balance and improves the utilization of computing resources.
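The abstract does not state the threshold expression itself, so the following is only a minimal Python sketch of the general idea, not the authors' formula. It assumes a composite load computed as a weighted sum of the five metrics named above and a threshold taken as the population standard deviation of the node loads, so the threshold follows the cluster state instead of being fixed by the user. The weights, the `NodeMetrics` type, and the function names are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class NodeMetrics:
    """Per-node utilization figures, each expressed as a fraction in [0, 1]."""
    name: str
    disk_space: float
    cpu: float
    memory: float
    disk_io: float
    network: float


# Hypothetical weights (they sum to 1.0); the paper's actual weighting is not
# given in the abstract.
WEIGHTS = {
    "disk_space": 0.40,
    "cpu": 0.15,
    "memory": 0.15,
    "disk_io": 0.15,
    "network": 0.15,
}


def composite_load(node: NodeMetrics) -> float:
    """Weighted sum of the five metrics for one node."""
    return sum(weight * getattr(node, field) for field, weight in WEIGHTS.items())


def dynamic_threshold(loads: list[float]) -> float:
    """Assumed rule: use the spread of the node loads as the threshold."""
    return pstdev(loads)


def classify(nodes: list[NodeMetrics]) -> dict[str, str]:
    """Label each node relative to the cluster average plus/minus the threshold."""
    loads = {node.name: composite_load(node) for node in nodes}
    average = mean(loads.values())
    threshold = dynamic_threshold(list(loads.values()))
    labels = {}
    for name, load in loads.items():
        if load > average + threshold:
            labels[name] = "over"
        elif load < average - threshold:
            labels[name] = "under"
        else:
            labels[name] = "balanced"
    return labels
```

A balancer built on this sketch would move blocks from nodes labeled `over` to nodes labeled `under`. By contrast, the stock tool takes a fixed percentage via `hdfs balancer -threshold`.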

Keywords: cloud environment; Hadoop Distributed File System (HDFS); load balancing; dynamic threshold

Classification: TP399 [Automation and Computer Technology: Computer Application Technology]

 
