Author: 邢富坤 (Xing Fukun) [1]
Source: 《计算机应用与软件》 (Computer Applications and Software), 2012, No. 8, pp. 41-45, 61 (6 pages in total)
Funding: National Natural Science Foundation of China (60872121)
Abstract: Using Wikipedia and existing named entity resources, this article proposes a membership calculation method for Wikipedia classes. The method builds a named entity instance set of relatively high quality and large scale through five steps: matching, computation, filtering, expansion, and denoising. Experiments on English Wikipedia data show that the set of personal name instances acquired automatically by the membership-based method is nearly 10 times larger than the set of personal name instances extracted by DBpedia. Manual checks of instances extracted from different membership intervals show that the top 15,000 extracted Wikipedia classes reach a precision of about 99%, which can effectively support the expansion of named entity class instances.
Classification number: TP311 [Automation and Computer Technology: Computer Software and Theory]