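Implementation of a Parallel Spatial Join Algorithm on an Open-Source Relational Database Cluster (开源关系数据库集群的并行空间连接算法实现)  Cited by: 3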

Research and Realization on Parallel Spatial Join Query Algorithm Based on Open Source RDBMS Cluster


Authors: Fan Xieyu (范协裕)[1], Ren Yingchao (任应超)[2]

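Affiliations: [1] College of Resources and Environment, Fujian Agriculture and Forestry University, Fuzhou 350002; [2] Institute of Remote Sensing and Digital Earth, Chinese Academy of Sciences, Beijing 100011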

Source: Computer Systems & Applications (《计算机系统应用》), 2016, No. 10, pp. 233-239 (7 pages)

Funding: National High Technology Research and Development Program of China (863 Program) (2013AA12A403)

Abstract: Existing studies on parallel spatial join queries focus mostly on algorithm design; few address implementation and application on a parallel relational database management system. After analyzing the workflow of existing algorithms, this paper divides parallel spatial join into four phases: candidate task generation, task assignment, task execution, and result collection. Each phase is designed and implemented on a parallel RDBMS cluster built on the open-source project PL/Proxy. To support parallel computation, a hybrid (mixed) computation migration mode is proposed, and the cluster is extended with support for spatial operations. On this basis, an extensible parallel spatial join algorithm based on spatial partitioning is implemented. Experiments on real data sets show that the implemented algorithm achieves near-linear speedup when the spatial partitioning is load-balanced, and still gains some speedup when the partitioning produces severe data skew. The partitioning (declustering) scheme is replaceable, so the algorithm can be extended with improved schemes. The implementation offers a practicable solution for research on parallel spatial data management.

Keywords: spatial data partitioning; parallel spatial join query; computation migration; parallel relational database

Classification: TP311.13 [Automation and Computer Technology: Computer Software and Theory]
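
The four phases named in the abstract (candidate task generation, assignment, execution, result collection) can be illustrated with a minimal sketch. This is not the authors' PL/Proxy implementation: it assumes a uniform grid as the spatial partitioning scheme, axis-aligned bounding rectangles as geometries, and a local thread pool standing in for the database nodes. All function names and the `cell_size` and `workers` parameters are hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Assumption: a geometry is a bounding rectangle (xmin, ymin, xmax, ymax),
# and each input data set is a dict mapping an id to its rectangle.

def intersects(a, b):
    """True when two axis-aligned rectangles overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def cells_for(rect, cell_size):
    """Yield every grid cell (col, row) that the rectangle overlaps."""
    x0, y0 = int(rect[0] // cell_size), int(rect[1] // cell_size)
    x1, y1 = int(rect[2] // cell_size), int(rect[3] // cell_size)
    for cx in range(x0, x1 + 1):
        for cy in range(y0, y1 + 1):
            yield (cx, cy)

def generate_tasks(left, right, cell_size):
    """Phase 1: one candidate task per cell holding rectangles from both inputs."""
    buckets = defaultdict(lambda: ([], []))
    for side, data in enumerate((left, right)):
        for key, rect in data.items():
            for cell in cells_for(rect, cell_size):
                buckets[cell][side].append((key, rect))
    return [pair for pair in buckets.values() if pair[0] and pair[1]]

def execute_task(task):
    """Phase 3: nested-loop join restricted to one cell."""
    left_part, right_part = task
    return {(lk, rk)
            for lk, lr in left_part
            for rk, rr in right_part
            if intersects(lr, rr)}

def parallel_spatial_join(left, right, cell_size=10.0, workers=4):
    tasks = generate_tasks(left, right, cell_size)
    # Phase 2: assignment is delegated to the pool, which hands tasks to idle workers.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = list(pool.map(execute_task, tasks))
    # Phase 4: collect; set union drops pairs reported by more than one cell.
    result = set()
    for pairs in partial:
        result |= pairs
    return sorted(result)

if __name__ == "__main__":
    roads = {"r1": (0, 0, 15, 2), "r2": (30, 30, 35, 35)}
    parcels = {"p1": (12, 1, 18, 6), "p2": (0, 5, 3, 8), "p3": (34, 34, 40, 40)}
    print(parallel_spatial_join(roads, parcels))  # [('r1', 'p1'), ('r2', 'p3')]
```

A rectangle that spans several cells is replicated into each of them, so the same pair can be found by more than one task; the set union in the collection phase removes those duplicates. Because every task is independent, a skewed partitioning shows up as a few oversized tasks, which is the situation in which the abstract reports reduced speedup. Swapping `cells_for` for another partitioning function mirrors the replaceable declustering scheme the abstract describes.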

 
