ERE: Entity Relationship Extraction System Based on Semi-structured Web Pages (Cited by: 2)

Authors: 余东[1] 李诺[1] 申德荣[1] 汤楠[1] 徐宏斌[1] 寇月[1] 于戈[1]

Source: 《计算机与数字工程》 (Computer & Digital Engineering), 2014, No. 9, pp. 1581-1586, 1662 (7 pages in total)

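Funding: Supported by the National Natural Science Foundation of China (No. 61033007); the Doctoral Program Foundation of the Ministry of Education (No. 20120042110028); and the Ministry of Education-Intel Special Research Fund for Information Technology (No. MOE-INTEL-2012-06)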

Abstract: Traditional entity relationship extraction methods mainly target text with relatively complete semantic information: they extract entity relationships from the text using extraction patterns, then use heuristic algorithms or probabilistic models to select among the extracted candidate relationships. These methods are less applicable to semi-structured web pages, where entity information is presented in HTML modules rather than in complete sentences, so the semantic information is incomplete. This paper proposes an entity relationship extraction system that handles semi-structured pages well. The system consists of four functional modules: data extraction rule learning, data extraction, entity relationship computation, and a relationship base query interface for users. The user enters a keyword and selects a matching type; the system queries the entity information base by keyword and matching type, then uses the entities that satisfy the conditions to query the entity relationship base, and returns the relationships that contain those entities to the user.
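The query flow described in the abstract (keyword and matching type, then entity lookup, then relationship lookup) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not name the matching types, the storage format, or any API, so the `"exact"` and `"fuzzy"` modes, the `Relationship` record, and the in-memory `ENTITY_BASE` and `RELATIONSHIP_BASE` lists are hypothetical stand-ins, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """Hypothetical record in the entity relationship base."""
    subject: str
    relation: str
    obj: str


# Toy stand-ins for the entity information base and the relationship base.
ENTITY_BASE = ["Alice Example", "Bob Example", "Example University"]
RELATIONSHIP_BASE = [
    Relationship("Alice Example", "coauthor_of", "Bob Example"),
    Relationship("Alice Example", "affiliated_with", "Example University"),
    Relationship("Bob Example", "affiliated_with", "Example University"),
]


def match_entities(entities, keyword, match_type):
    """Step 1: find entities matching the keyword under the chosen type.

    The two match types are assumed; the abstract only says the user
    selects a matching type.
    """
    if match_type == "exact":
        return [e for e in entities if e == keyword]
    if match_type == "fuzzy":
        # Case-insensitive substring match as a simple fuzzy stand-in.
        return [e for e in entities if keyword.lower() in e.lower()]
    raise ValueError(f"unknown match type: {match_type!r}")


def query_relationships(entities, relationships, keyword, match_type="exact"):
    """Step 2: return every relationship that contains a matched entity."""
    matched = set(match_entities(entities, keyword, match_type))
    return [r for r in relationships
            if r.subject in matched or r.obj in matched]


if __name__ == "__main__":
    for rel in query_relationships(ENTITY_BASE, RELATIONSHIP_BASE,
                                   "alice", "fuzzy"):
        print(rel)
```

Keeping the entity lookup separate from the relationship lookup mirrors the two-stage query in the abstract, so the matching strategy can change without touching the relationship filter.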

Keywords: entity relationship; entity relationship extraction; data extraction; entity matching

CLC number: TP391 [Automation and Computer Technology: Computer Application Technology]
