Authors: YANG Cheng (杨程); LU Jiamin (陆佳民) [1]; FENG Jun (冯钧) [1] (College of Computer and Information, Hohai University, Nanjing, Jiangsu 211100, China)
Source: Journal of Computer Applications (《计算机应用》), 2020, No. 11, pp. 3184-3191 (8 pages)
Funding: National Key Research and Development Program of China (2017YFC0405806, 2018YFC0407901).
Abstract: With the rapid development of knowledge graphs and their wide use in various vertical domains, the efficient processing of Resource Description Framework (RDF) data has become a new topic in modern big data management. RDF is a data model proposed by W3C to describe knowledge graph entities and the relationships between them. To cope with the storage and querying of large-scale RDF data, many researchers have considered managing RDF data in a distributed environment. The key problem in distributed RDF storage is data partitioning, and the partitioning result largely determines the performance of SPARQL (SPARQL Protocol and RDF Query Language) queries. From the perspective of data partitioning, this survey focuses on two categories of methods and describes them in depth: graph structure-based RDF data partitioning and semantics-based RDF data partitioning. The former includes multi-granularity hierarchical partitioning, template partitioning, and clustering partitioning, and suits general-domain queries whose semantic scope is broad. The latter includes hash partitioning, vertical partitioning, and pattern partitioning, and is better suited to vertical-domain queries whose semantic scope is relatively fixed. In addition, several typical partitioning methods are compared and analyzed to provide a reference for future research on RDF data partitioning. Finally, future research directions for RDF data partitioning methods are summarized.
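Keywords: Resource Description Framework (RDF); data partitioning; distributed RDF data storage; SPARQL query; distributed database
Classification: TP311.133.1 [Automation and Computer Technology: Computer Software and Theory]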
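Two of the semantics-based methods named in the abstract, hash partitioning and vertical partitioning, can be illustrated with a minimal sketch. The record gives no definitions, so the code below assumes the formulations commonly used in the RDF literature: hash partitioning assigns each triple to a node by hashing its subject, and vertical partitioning stores one two-column (subject, object) table per predicate. The sample triples, the function names `hash_partition` and `vertical_partition`, and the choice of CRC32 as the hash are illustrative assumptions, not details taken from the paper.

```python
import zlib
from collections import defaultdict

# Hypothetical toy triples (subject, predicate, object); not from the paper.
TRIPLES = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:alice", "ex:worksAt", "ex:hohai"),
    ("ex:bob", "ex:knows", "ex:carol"),
    ("ex:carol", "ex:worksAt", "ex:hohai"),
]


def hash_partition(triples, num_partitions):
    """Assign each triple to a partition by hashing its subject.

    CRC32 is used (an assumption) because it is stable across runs,
    unlike Python's built-in hash() for strings.
    """
    parts = [[] for _ in range(num_partitions)]
    for s, p, o in triples:
        idx = zlib.crc32(s.encode("utf-8")) % num_partitions
        parts[idx].append((s, p, o))
    return parts


def vertical_partition(triples):
    """Group triples into one (subject, object) table per predicate."""
    tables = defaultdict(list)
    for s, p, o in triples:
        tables[p].append((s, o))
    return dict(tables)


if __name__ == "__main__":
    for i, part in enumerate(hash_partition(TRIPLES, 2)):
        print("partition", i, part)
    for predicate, rows in vertical_partition(TRIPLES).items():
        print(predicate, rows)
```

Under subject hashing, all triples that share a subject land on the same node, so star-shaped SPARQL patterns around one subject can be answered locally. With per-predicate tables, a query that fixes the predicate only has to scan that predicate's table.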