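Authors: Qi Kaiyuan (亓开元)[1,2,3], Han Yanbo (韩燕波)[1], Zhao Zhuofeng (赵卓峰)[1], Ma Qiang (马强)[2,3]
Affiliations: [1] Research Center for Cloud Computing, North China University of Technology, Beijing 100144; [2] Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; [3] University of Chinese Academy of Sciences, Beijing 100190
Source: Computer Integrated Manufacturing Systems (《计算机集成制造系统》), 2013, No. 3, pp. 641-653 (13 pages)
Funding: Supported by the National Natural Science Foundation of China (60903137; 60970132)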
Abstract: With the development of the Internet of Things, realizing real-time computation over high-speed sensor data streams on top of large-scale historical sensor data has become a new challenge for cloud manufacturing. A data stream processing method oriented to large-scale historical data, named Real-Time MapReduce (RTMR), is proposed; it improves the data stream processing capability of MapReduce through intermediate result caching, pipelining, and localization. On this basis, to construct RTMR clusters adaptively, a model analysis method is used to configure node types and topology according to application characteristics and the cluster environment. To balance load across the cluster, idle nodes and overloaded nodes are grouped by computing load state transition relations, so that the NP-hard dynamic load balancing problem is quickly decomposed into smaller subproblems. Execution time and data movement cost are combined as the optimization objective of each subproblem, which speeds up the response to load skew. Experiments show that the proposed methods and techniques can ensure the scalability of data stream processing over large-scale historical data.
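Keywords: data stream processing; large-scale data processing; MapReduce; adaptive architecture; load balancing
Classification: TP393 [Automation and Computer Technology: Computer Application Technology]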