Research on Automatic Acquisition of Named Entities Based on Membership Degree  Cited by: 1

STUDY ON AUTOMATIC ACQUISITION OF NAMED ENTITY BASED ON MEMBERSHIP


Author: Xing Fukun (邢富坤)[1]

Affiliation: [1] PLA University of Foreign Languages, Luoyang, Henan 471003, China

Source: Computer Applications and Software (《计算机应用与软件》), 2012, No. 8, pp. 41-45, 61 (6 pages in total)

Funding: National Natural Science Foundation of China (60872121)

Abstract: Using Wikipedia together with existing named entity resources, this paper proposes a method for calculating the membership degree of Wikipedia classes. Through five steps (matching, computation, filtering, expansion, and denoising), it constructs a named entity instance set of relatively high quality and large scale. In experiments on English Wikipedia data, the set of person-name instances acquired automatically with the membership-based method is nearly 10 times larger than the set of person-name instances extracted by DBpedia. Manual inspection of instances extracted from different membership-degree intervals shows that the top 15,000 extracted Wikipedia classes reach a precision of about 99%, which can effectively support the expansion of named entity class instances.

Keywords: named entity; automatic acquisition; Wikipedia; membership degree

Classification code: TP311 [Automation and Computer Technology: Computer Software and Theory]
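
The five steps named in the abstract (matching, computation, filtering, expansion, denoising) can be illustrated with a minimal sketch. The record does not give the paper's actual membership formula, so everything below is an assumption made for illustration: a "Wikipedia class" is treated as a category with a list of member page titles, the membership degree is taken to be the fraction of a class's members that appear in an existing named entity list, and the names `category_membership`, `acquire_instances`, `threshold`, and the "List of" noise rule are hypothetical.

```python
def category_membership(category_members, seed_entities):
    """Match and compute: score each class by the share of its members
    found in the seed entity list (an assumed definition, not the paper's)."""
    scores = {}
    for category, members in category_members.items():
        if not members:
            continue  # an empty class has no defined score
        matched = sum(1 for title in members if title in seed_entities)
        scores[category] = matched / len(members)
    return scores


def acquire_instances(category_members, seed_entities, threshold=0.5):
    """Run the five steps on toy data and return (instances, scores)."""
    # Steps 1-2: matching and membership computation.
    scores = category_membership(category_members, seed_entities)
    # Step 3: filtering, keep classes whose score reaches the threshold.
    kept = [c for c, s in scores.items() if s >= threshold]
    # Step 4: expansion, every member of a kept class becomes a candidate.
    candidates = set()
    for category in kept:
        candidates.update(category_members[category])
    # Step 5: denoising, drop obvious non-entity pages (hypothetical rule).
    instances = {t for t in candidates if not t.startswith("List of")}
    return instances, scores


# Toy data, invented for this example only.
toy_categories = {
    "1950 births": ["Alice Smith", "Bob Jones", "Carol White",
                    "List of people born in 1950"],
    "Cities in Ohio": ["Columbus", "Dayton"],
}
toy_seed_people = {"Alice Smith", "Bob Jones"}

people, membership = acquire_instances(toy_categories, toy_seed_people)
print(sorted(people))
print(membership)
```

In this toy run, "Carol White" is not in the seed list but is picked up because her class scores high enough, which is the kind of expansion the abstract describes. The threshold value is arbitrary here; the abstract only states that precision was checked manually across different membership-degree intervals.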

 
