Authors: Chao KONG, Ming GAO, Chen XU, Yunbin FU, Weining QIAN, Aoying ZHOU
Affiliations: [1] School of Data Science and Engineering, East China Normal University, Shanghai 200062, China; [2] Technische Universität Berlin, Berlin 10623, Germany
Source: Frontiers of Computer Science, 2019, No. 1, pp. 157-169 (13 pages). Chinese journal title: 中国计算机科学前沿(英文版)
Funding: the National Key Research and Development Program of China (2016YFB1000905); the National Natural Science Foundation of China (Grant Nos. U1401256, 61402177, 61672234, 61402180, and 61232002); NSF of Shanghai (14ZR1412600).
Abstract: Entity alignment is the problem of identifying which entities in one data source refer to the same real-world entity as those in others. Identifying entities across heterogeneous data sources is paramount to many research fields, such as data cleaning, data integration, information retrieval, and machine learning. The aligning process is not only overwhelmingly expensive for large data sources, since it involves all tuples from two or more data sources, but also needs to handle heterogeneous entity attributes. In this paper, we propose an unsupervised approach, called EnAli, to match entities across two or more heterogeneous data sources. EnAli employs a generative probabilistic model that incorporates heterogeneous entity attributes via the exponential family and handles missing values, and it uses a locality sensitive hashing scheme to reduce the candidate tuples and speed up the aligning process. EnAli is highly accurate and efficient even without any ground-truth tuples. We illustrate the performance of EnAli on re-identifying entities from the same data source, as well as on aligning entities across three real data sources. Our experimental results show that the proposed approach outperforms the comparable baseline.
Keywords: entity alignment; exponential family; locality sensitive hashing; EM algorithm
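
The abstract mentions locality sensitive hashing as the means of shrinking the set of candidate tuples before alignment. The following is a minimal sketch of that general idea using MinHash with banding; it is not EnAli's implementation, which this record does not describe. The function names (`shingles`, `minhash_signature`, `candidate_pairs`), the character 3-gram representation, and the parameter values are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def shingles(text, k=3):
    """Return the set of character k-grams of a lower-cased string."""
    s = text.lower()
    return {s[i:i + k] for i in range(max(1, len(s) - k + 1))}


def minhash_signature(items, num_hashes=32):
    """Keep the minimum hash value per seed; equal sets give equal signatures."""
    signature = []
    for seed in range(num_hashes):
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(f"{seed}:{x}".encode(), digest_size=8).digest(),
                "big",
            )
            for x in items
        ))
    return signature


def candidate_pairs(records, num_hashes=32, bands=16):
    """Bucket records by signature band; only records sharing a bucket are paired."""
    rows = num_hashes // bands
    buckets = defaultdict(set)
    for rid, text in records.items():
        sig = minhash_signature(shingles(text), num_hashes)
        for b in range(bands):
            # Records whose signatures agree on every row of a band collide here.
            key = (b, tuple(sig[b * rows:(b + 1) * rows]))
            buckets[key].add(rid)
    pairs = set()
    for ids in buckets.values():
        for a, b in combinations(sorted(ids), 2):
            pairs.add((a, b))
    return pairs


if __name__ == "__main__":
    # Hypothetical records from two sources, keyed by made-up identifiers.
    records = {
        "src1:1": "Aoying Zhou, East China Normal University",
        "src2:7": "Aoying Zhou, East China Normal University",
        "src2:9": "Technische Universität Berlin",
    }
    print(candidate_pairs(records))
```

Only the pairs returned by `candidate_pairs` would be handed to a more expensive matching model, so the cost no longer grows with every possible pair of tuples. More bands with fewer rows each make collisions more likely and raise recall at the price of more candidates.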