Hierarchical Storage Scheduling Mechanism in Heterogeneous HDFS Clusters  (Cited by: 5)

Scheduling Strategy of Hierarchical Storage in Heterogeneous Cluster About HDFS

Authors: Yang Dongju[1,2], Li Qing[1,2], Deng Chongbin

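Affiliations: [1] Research Center for Cloud Computing, North China University of Technology, Beijing 100144; [2] Beijing Key Laboratory on Integration and Analysis of Large-scale Stream Data, Beijing 100144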

Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2017, No. 1, pp. 29-34 (6 pages)

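Funding: Supported by the Key Project of the Science and Technology Plan of the Beijing Municipal Education Commission (KZ201310009009) and by the Innovation Team Building and Teacher Career Development Program for Beijing Municipal Universities (IDHT20130502)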

Abstract: Most storage clusters contain both legacy devices and newly purchased ones, and these devices differ considerably in storage performance. Under the default rack-aware storage policy of the Hadoop Distributed File System (HDFS), frequently accessed data may be stored on low-performance nodes while rarely accessed data is stored on high-performance nodes, which both lengthens cluster response time and lowers resource utilization. To address this problem, a hierarchical storage scheduling mechanism is proposed. Building on the HDFS rack-aware scheduling policy, nodes are first divided into high-configuration and low-configuration nodes according to inherent hardware characteristics such as CPU, memory size, disk size, and disk I/O. A node performance evaluation model is then built from dynamic factors such as CPU usage, memory usage, network bandwidth usage, and disk usage, and three performance levels, p1, p2, and p3 (from high to low), are defined. Scheduling decisions combine node configuration, performance level, network location, and other factors. While the cluster is running, the distribution of data blocks is also adjusted dynamically according to data access frequency. Experimental results show that the proposed mechanism improves data access efficiency and optimizes cluster performance in heterogeneous HDFS clusters.
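The two-stage idea in the abstract (a static split by hardware configuration, then a dynamic performance level) can be illustrated with a minimal Python sketch. This is not the authors' model: the abstract gives no formula, so the weights, the thresholds, and the names `NodeStats`, `load_score`, `performance_level`, and `pick_node` are all hypothetical, and network location is left out entirely. Only the four usage factors and the level labels p1 to p3 come from the abstract.

```python
from dataclasses import dataclass


@dataclass
class NodeStats:
    """Dynamic usage factors named in the abstract, each in the range 0.0-1.0."""
    cpu_usage: float
    mem_usage: float
    net_usage: float
    disk_usage: float


# Hypothetical weights; the abstract does not state how the factors are combined.
WEIGHTS = {"cpu_usage": 0.3, "mem_usage": 0.3, "net_usage": 0.2, "disk_usage": 0.2}


def load_score(stats: NodeStats) -> float:
    """Weighted load of a node; a lower score means more spare capacity."""
    return sum(weight * getattr(stats, name) for name, weight in WEIGHTS.items())


def performance_level(stats: NodeStats, idle: float = 0.4, busy: float = 0.7) -> str:
    """Map the load score to p1 (best), p2, or p3 (worst).

    The two thresholds are assumptions made for this sketch.
    """
    score = load_score(stats)
    if score < idle:
        return "p1"
    if score < busy:
        return "p2"
    return "p3"


def pick_node(nodes, hot: bool) -> str:
    """Choose a target node for a data block.

    nodes: list of (name, is_high_config, NodeStats) tuples.
    Hot data prefers high-configuration nodes and cold data prefers
    low-configuration ones; among matching nodes the best level wins.
    """
    def rank(entry):
        _name, is_high_config, stats = entry
        mismatch = is_high_config != hot  # False sorts before True
        return (mismatch, performance_level(stats))

    return min(nodes, key=rank)[0]


nodes = [
    ("old-1", False, NodeStats(0.2, 0.3, 0.1, 0.2)),
    ("new-1", True, NodeStats(0.5, 0.5, 0.5, 0.5)),
    ("new-2", True, NodeStats(0.9, 0.8, 0.9, 0.7)),
]
hot_target = pick_node(nodes, hot=True)
cold_target = pick_node(nodes, hot=False)
```

In this toy cluster the hot block goes to the less loaded of the two high-configuration nodes, and the cold block goes to the legacy node. A real placement policy would also have to respect rack awareness and replica counts, which this sketch ignores.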

Keywords: cloud storage; HDFS; heterogeneous cluster; hierarchical storage; storage scheduling

Classification: TP301 [Automation and Computer Technology: Computer System Architecture]

 
