VDoc+: a virtual document based approach for matching large ontologies using MapReduce (cited by: 4)

Affiliations: [1] State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210093, China; [2] Department of Computer Science and Technology, Nanjing University, Nanjing 210093, China

Source: Journal of Zhejiang University-Science C (Computers and Electronics) (English edition), 2012, Issue 4, pp. 257-267 (11 pages)

Funding: Supported by the National Natural Science Foundation of China (No. 61003018); the Natural Science Foundation of Jiangsu Province, China (No. BK2011189); and the National Social Science Foundation of China (No. 11AZD121)

Abstract: Many ontologies have been published on the Semantic Web to be shared for describing resources. Among them, large ontologies of real-world domains pose a scalability problem for semantic technologies such as ontology matching (OM): existing OM approaches either suffer from excessively long run times or make strong assumptions about the running environment. To deal with this issue, we propose V-Doc+, a three-stage approach for matching large ontologies, based on the MapReduce framework and the virtual document technique. Specifically, two MapReduce processes are performed in the first stage to extract the textual descriptions of named entities (classes, properties, and instances) and blank nodes, respectively. In the second stage, the extracted descriptions are exchanged with neighbors in Resource Description Framework (RDF) graphs to construct virtual documents; this process also benefits from the MapReduce-based implementation. In the third stage, a word-weight-based partitioning method is proposed to conduct parallel similarity calculation using the term frequency-inverse document frequency (TF-IDF) model. Experimental results on two large-scale real datasets and the benchmark testbed from the Ontology Alignment Evaluation Initiative (OAEI) are reported, showing that the proposed approach significantly reduces the run time with minor loss in precision and recall.
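To make the three stages described in the abstract more concrete, the following is a minimal single-machine sketch in Python. It is not the authors' implementation: it runs sequentially rather than on a MapReduce cluster, it omits the word-weight-based partitioning step, and the function names, the `neighbor_weight` parameter, the smoothed IDF formula, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def extract_descriptions(labels):
    """Stage 1 (sketch): turn each entity's text into word counts.

    The generator plays the role of a "map" step emitting (entity, word)
    pairs; the loop plays the role of a "reduce" step grouping by entity.
    """
    pairs = ((e, w) for e, text in labels.items() for w in text.lower().split())
    descriptions = defaultdict(Counter)
    for entity, word in pairs:
        descriptions[entity][word] += 1
    return descriptions


def build_virtual_documents(descriptions, edges, neighbor_weight=0.5):
    """Stage 2 (sketch): add down-weighted descriptions of graph neighbors.

    neighbor_weight is a hypothetical damping factor, not a value from the paper.
    """
    docs = {e: Counter(c) for e, c in descriptions.items()}
    for a, b in edges:
        for src, dst in ((a, b), (b, a)):
            for word, count in descriptions.get(src, {}).items():
                docs.setdefault(dst, Counter())[word] += neighbor_weight * count
    return docs


def tfidf_vectors(docs):
    """Stage 3a (sketch): weight words with a smoothed TF-IDF variant (assumed)."""
    n = len(docs)
    df = Counter(word for doc in docs.values() for word in doc)
    return {
        e: {w: tf * (math.log((1 + n) / (1 + df[w])) + 1.0) for w, tf in doc.items()}
        for e, doc in docs.items()
    }


def cosine(u, v):
    """Stage 3b (sketch): cosine similarity of two sparse word-weight vectors."""
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(
        sum(x * x for x in v.values())
    )
    return dot / norm if norm else 0.0


def match(labels_a, edges_a, labels_b, edges_b):
    """Score every cross-ontology entity pair (all pairs, no partitioning)."""
    docs_a = build_virtual_documents(extract_descriptions(labels_a), edges_a)
    docs_b = build_virtual_documents(extract_descriptions(labels_b), edges_b)
    vectors = tfidf_vectors({**docs_a, **docs_b})
    return {(a, b): cosine(vectors[a], vectors[b]) for a in docs_a for b in docs_b}


if __name__ == "__main__":
    # Toy ontologies with hypothetical entity names and labels.
    sims = match(
        {"a:Paper": "conference paper", "a:Person": "person name"},
        [("a:Paper", "a:Person")],
        {"b:Article": "conference paper", "b:City": "location city"},
        [],
    )
    for pair, score in sorted(sims.items(), key=lambda kv: -kv[1]):
        print(pair, round(score, 3))
```

Comparing all entity pairs, as this sketch does, grows quadratically with ontology size. That cost is what the partitioning and parallel similarity calculation in the third stage are meant to address.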

Keywords: Ontology matching; Virtual document; MapReduce; TF-IDF; Semantic Web

Classification: TP311 [Automation and Computer Technology - Computer Software and Theory]
