Data Allocation Algorithm for Large Data with All-to-all Comparison Based on Graph Covering (基于图覆盖的大数据全比较数据分配算法)  Cited by: 6

Authors: GAO Yanjun (高燕军), ZHANG Xueying (张雪英) [1], LI Fenglian (李凤莲) [1], TIAN Yuchu (田玉楚) [1,2]

Affiliations: [1] College of Information Engineering, Taiyuan University of Technology, Jinzhong, Shanxi 030600, China; [2] School of Electrical Engineering and Computer Science, Queensland University of Technology, Brisbane 4001, Australia

Source: Computer Engineering (《计算机工程》), 2018, No. 4, pp. 17-22, 27 (7 pages in total)

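Funding: Shanxi Province Graduate Joint Training Base Talent Training Project (2017JD16); Shanxi Province Outstanding Talent Science and Technology Innovation Project (201605D211021); 2016 Taiyuan University of Technology Teaching Reform Project (24)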

Abstract: In the distributed processing of all-to-all comparison problems for large data, existing data allocation strategies give little consideration to the special dependency between comparison tasks and data, which leads to low storage efficiency and imbalanced task allocation. To address this problem, a Data Allocation Algorithm Based on Graph Covering (DAABGC) is proposed. First, theoretical analysis reduces the data allocation problem for all-to-all comparison of large data to a graph covering problem. Then, optimal solutions of the graph covering problem are constructed, and the data are allocated according to the special solution. Experimental results show that, compared with the Hadoop-based data allocation strategy, the proposed algorithm ensures that comparison tasks have 100% data locality and that the load is balanced between nodes. It also improves the storage saving rate and overall computing performance.
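The covering idea in the abstract can be illustrated with a minimal Python sketch. This is not the DAABGC construction, which the abstract does not detail; it uses a simple, assumed group-pair covering in which the data items are split into groups and each node stores one pair of groups, so every pair of items is co-located on at least one node. The function names `allocate_by_group_pairs` and `assign_tasks`, and the greedy least-loaded task assignment, are hypothetical choices made only for illustration.

```python
from itertools import combinations


def allocate_by_group_pairs(num_items, num_groups):
    """Assumed covering (not the paper's): each node stores one pair of groups."""
    # Round-robin split of item indices 0..num_items-1 into num_groups groups.
    groups = [list(range(g, num_items, num_groups)) for g in range(num_groups)]
    # One node per unordered pair of groups, so any two items share some node.
    return [sorted(groups[a] + groups[b])
            for a, b in combinations(range(num_groups), 2)]


def assign_tasks(num_items, allocation):
    """Give each comparison (i, j) to the least-loaded node that stores both items."""
    node_sets = [set(items) for items in allocation]
    tasks = [[] for _ in allocation]
    for i, j in combinations(range(num_items), 2):
        holders = [n for n, s in enumerate(node_sets) if i in s and j in s]
        if not holders:
            raise ValueError(f"pair ({i}, {j}) is not covered by any node")
        target = min(holders, key=lambda n: len(tasks[n]))
        tasks[target].append((i, j))
    return tasks


NUM_ITEMS = 12
allocation = allocate_by_group_pairs(NUM_ITEMS, num_groups=4)
tasks = assign_tasks(NUM_ITEMS, allocation)

# Fraction of storage saved per node relative to storing every item everywhere.
storage_saving = 1 - len(allocation[0]) / NUM_ITEMS

print("items per node:", [len(items) for items in allocation])
print("tasks per node:", [len(t) for t in tasks])
print("storage saving per node:", storage_saving)
```

Because every comparison is assigned only to a node that already holds both items, all tasks in this sketch run on local data, which is the sense of "100% data locality" used in the abstract. The group-pair layout is one easy covering, not necessarily an optimal one.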

Keywords: distributed computing; big data; all-to-all comparison; data allocation; graph covering

Classification number: TP301.6 [Automation and Computer Technology: Computer System Architecture]

 
