检索规则说明:AND代表“并且”;OR代表“或者”;NOT代表“不包含”;(注意必须大写,运算符两边需空一格)
检 索 范 例 :范例一: (K=图书馆学 OR K=情报学) AND A=范并思 范例二:J=计算机应用与软件 AND (U=C++ OR U=Basic) NOT M=Visual
作 者:张强 张学文[2] ZHANG Qiang;ZHANG Xue-wen(School of Electrical Information,City College of Science and Technology,Chongqing University,Chongqing 402167,China;College of Mechanical Engineering,Beihua University,Jilin 132013,China)
机构地区:[1]重庆大学城市科技学院电气信息学院,重庆402167 [2]北华大学机械工程学院,吉林吉林132013
出 处:《计算机工程与设计》2021年第2期475-481,共7页Computer Engineering and Design
基 金:国家自然科学基金项目(60875064、61175102)。
摘 要:MapReduce进行大数据分布式计算时,数据集倾斜特性将导致子任务间完成时间差异明显,影响计算性能。提出基于布隆滤波二次拆分的处理算法。考虑硬件配置、计算资源等约束,提出基于布隆滤波的改进Hash分区函数,缩小映射操作的选择范围。在Reduce阶段设计学习自动机分区算法,使用均匀分布抽样获取任务类型与规模,以Reducer范围为约束条件搜索Mapper与Reducer的近似最优映射。与不考虑Reducer异构的常规算法相比,所提算法网络开销与负载变化率可降低2.4%和28%以上,负载均衡度则提升7.31%以上,表明所提算法具有更高的计算效率。In MapReduce,due to the skew feature of data sets,the completion time of each sub task is significantly different,which affects the calculation performance of the final big data processing.Hence,an algorithm with Bloom filter based secondary partition was proposed.Considering the constraints such as hardware configuration and computing resources,an improved Hash partition function based on Bloom filtering was proposed to narrow the selection range in Mapper stage.A partitioning algorithm based on learning automata was designed in the Reduce stage.The uniform distribution sampling algorithm was used to obtain the data task type and scale.The Reducer candidate range was used as the constraint to search the approximate optimal mapping relationship between Mapper and Reducer.The results of the example show that compared with the conventional algorithm which does not consider the heterogeneous characteristics of Reducer,the network overhead and load change rate are reduced by 2.4%and 28%respectively,and the load balance is improved by over 7.31%.Hence,the proposed algorithm has higher computational efficiency.
关 键 词:数据倾斜 布隆滤波 二次拆分 学习自动机 大数据处理 负载变化率
分 类 号:TP391[自动化与计算机技术—计算机应用技术]
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在载入数据...
正在链接到云南高校图书馆文献保障联盟下载...
云南高校图书馆联盟文献共享服务平台 版权所有©
您的IP:3.128.78.139