Named Entity Relation Extraction Method Based on Seed Self-expansion

Cited by: 25

Source: Computer Engineering (《计算机工程》), 2006, No. 21, pp. 183-184, 193 (3 pages)

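Funding: National Natural Science Foundation of China (60442005); Key Project of the Science and Technology Research Fund of the Ministry of Education (105117)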

Abstract: Extracting relations between named entities is an important research problem in information extraction. This paper proposes a method for automatically extracting named entity relations from a large text collection. The method finds all named entity pairs that occur in the same sentence and whose distance in words falls within a given range, and converts their contexts into vectors. A small number of named entity pairs that hold the target relation are selected by hand as the initial relation seed set, and this seed set is expanded continually through self-learning (bootstrapping). The named entity pairs to be extracted are then obtained by computing the context similarity between candidate pairs and the relation seeds. Expanding the relation seed set improves both the recall and the precision of extraction. In tests on the People's Daily corpus (PFR), the method achieves a weighted average F-score of 0.813.
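The seed self-expansion loop described in the abstract can be illustrated with a minimal sketch. The abstract does not state the vector weighting, the similarity measure, the acceptance threshold, or the stopping rule, so everything concrete here is an assumption made for illustration: bag-of-words context vectors, cosine similarity, a fixed `threshold`, and a `max_rounds` cap. The function names (`context_vector`, `cosine`, `expand_seeds`) are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def context_vector(tokens):
    """Bag-of-words vector for the context words around an entity pair (assumed weighting)."""
    return Counter(tokens)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (assumed similarity measure)."""
    dot = sum(count * v[word] for word, count in u.items() if word in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def expand_seeds(candidates, seed_pairs, threshold=0.5, max_rounds=10):
    """Grow the seed set by repeatedly adding similar candidate pairs.

    candidates: dict mapping an entity pair to its context vector.
    seed_pairs: hand-picked pairs; each must be a key of `candidates`.
    threshold, max_rounds: illustrative assumptions, not values from the paper.
    """
    seeds = set(seed_pairs)
    for _ in range(max_rounds):
        # A candidate joins the seed set if it is close enough to any current seed.
        new_pairs = {
            pair
            for pair, vec in candidates.items()
            if pair not in seeds
            and any(cosine(vec, candidates[s]) >= threshold for s in seeds)
        }
        if not new_pairs:
            break  # no further expansion is possible
        seeds |= new_pairs
    return seeds
```

Because newly accepted pairs act as seeds in later rounds, a pair that is dissimilar to every hand-picked seed can still be reached through an intermediate pair. A lower threshold therefore tends to raise recall at the risk of lowering precision.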

Keywords: named entity; relation extraction; self-learning

Classification: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]
