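Affiliations: [1] Cloud Computing Research Center, North China University of Technology, Beijing 100144 [2] Beijing Key Laboratory on Integration and Analysis of Large-Scale Stream Data, Beijing 100144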
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), 2017, No. 1, pp. 29-34 (6 pages)
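Funding: Supported by the Key Project of the Science and Technology Plan of the Beijing Municipal Education Commission (KZ201310009009) and by the Innovation Team Building and Teacher Career Development Program for Beijing Municipal Universities (IDHT20130502)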
Abstract: Most storage clusters contain both legacy and newly purchased devices, which differ considerably in storage performance. Under the default rack-aware storage policy of the Hadoop Distributed File System (HDFS), frequently accessed data may end up on low-performance nodes while rarely accessed data sits on high-performance nodes, which both lengthens cluster response time and lowers resource utilization. To address these problems, this paper proposes a hierarchical storage scheduling mechanism. Building on the HDFS rack-aware scheduling policy, nodes are first divided into high-configuration and low-configuration nodes according to inherent hardware characteristics such as CPU, memory size, disk size, and disk I/O. Next, a performance evaluation model is built for each node from dynamic factors such as CPU usage, memory usage, network bandwidth usage, and disk usage, and three performance levels p1, p2, p3 (from high to low) are defined. Scheduling then jointly considers node configuration, performance level, network location, and other factors. While the cluster is running, the distribution of data blocks is also adjusted dynamically according to data access frequency. Experimental results show that the proposed mechanism improves data access efficiency in heterogeneous HDFS clusters and optimizes cluster performance.
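The two-stage classification described in the abstract can be illustrated with a minimal sketch. The abstract gives neither the evaluation model's formula nor its thresholds, so everything below is an assumption made for illustration: the equal weights, the 0.4 and 0.7 cut-offs, and the names Node, load_score, performance_level, and pick_node are hypothetical, and rack or network location is left out entirely.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    high_config: bool   # static class from hardware (CPU, memory, disk, disk I/O)
    cpu_usage: float    # dynamic factors, each a fraction in [0, 1]
    mem_usage: float
    net_usage: float
    disk_usage: float


# Hypothetical equal weights; the paper's actual model is not given in the abstract.
WEIGHTS = {"cpu_usage": 0.25, "mem_usage": 0.25, "net_usage": 0.25, "disk_usage": 0.25}


def load_score(node: Node) -> float:
    """Weighted utilisation; a lower score means more spare capacity."""
    return sum(w * getattr(node, field) for field, w in WEIGHTS.items())


def performance_level(node: Node) -> int:
    """Map the score to levels 1..3 (p1 highest); thresholds are assumed."""
    score = load_score(node)
    if score < 0.4:
        return 1
    if score < 0.7:
        return 2
    return 3


def pick_node(nodes: list[Node], hot: bool) -> Node:
    """Prefer high-config nodes for hot data and low-config nodes for cold data,
    then the best performance level, then the lowest load."""
    def key(n: Node):
        mismatch = 0 if n.high_config == hot else 1
        return (mismatch, performance_level(n), load_score(n))
    return min(nodes, key=key)


nodes = [
    Node("a", True, 0.2, 0.2, 0.2, 0.2),
    Node("b", True, 0.8, 0.8, 0.8, 0.8),
    Node("c", False, 0.3, 0.3, 0.3, 0.3),
]
hot_target = pick_node(nodes, hot=True)
cold_target = pick_node(nodes, hot=False)
```

In this sketch a frequently accessed block is directed to the lightly loaded high-configuration node, and a rarely accessed block to the low-configuration node. A fuller version would apply the same ranking only among the candidates that the rack-aware policy already allows.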
Classification: TP301 [Automation and Computer Technology: Computer Systems Architecture]