Research on Ontology Non-taxonomic Relations Extraction in Plant Domain Knowledge Graph Construction (植物领域知识图谱构建中本体非分类关系提取方法)  Cited by: 19

Authors: Zhao Ming (赵明)[1], Du Yaru (杜亚茹)[1], Du Huifang (杜会芳)[1], Zhang Jiajun (张家军)[1], Wang Hongshuo (王红说)[1], Chen Ying (陈瑛)[1]

Affiliation: [1] College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China

Source: Transactions of the Chinese Society for Agricultural Machinery (《农业机械学报》), 2016, Issue 9, pp. 278-284 (7 pages)

Funding: National Natural Science Foundation of China (61503386)

Abstract: Providing more specific knowledge and technology for the plant field requires a knowledge graph (KG) built from a rich set of concepts and relations, and relation extraction is the most difficult step in KG construction. This paper adopts ontology learning, using the unstructured and semi-structured Chinese text of plant entries in Baidu Encyclopedia as the corpus. A supervised lexico-syntactic pattern method based on dependency parsing is used to obtain the concepts, taxonomic relations, and non-taxonomic relations of the plant domain. Word-list filtering and constraints added to the patterns substantially improve the precision of relation extraction, so that a heavyweight ontology is built automatically on top of a lightweight ontology. The method establishes a concept hierarchy from the domain-specific corpus, improves the discovery of the most representative taxonomic and non-taxonomic relations, and formalizes the extraction results in OWL 2.0. Experiments in the plant domain indicate that pattern-based extraction should be performed directly after natural language processing, that it achieves comparatively high accuracy relative to earlier algorithms, and that the method extracts non-taxonomic relations effectively, which lays a foundation for KG construction in the plant field.
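
Illustrative sketch (not taken from the paper): the abstract describes lexico-syntactic patterns over a dependency parse, restricted by pattern constraints and a word-list filter. The Python below shows one way such a pipeline could look, under several assumptions: the parse is supplied by hand rather than produced by a real parser, the dependency labels SBV (subject-verb) and VOB (verb-object) follow LTP-style naming, and the names Token, extract_relations, DOMAIN_TERMS, STOP_VERBS, and to_owl_axiom are hypothetical. Encoding a relation as an OWL 2 existential restriction is likewise only one possible modeling choice.

```python
from typing import List, NamedTuple, Set, Tuple


class Token(NamedTuple):
    idx: int   # 1-based position in the sentence
    word: str
    pos: str   # coarse part of speech: "n" noun, "v" verb
    head: int  # index of the head token, 0 for the root
    dep: str   # dependency label (LTP-style names assumed)


# Hypothetical domain word list and verbs excluded by a pattern constraint.
DOMAIN_TERMS: Set[str] = {"银杏", "黄酮", "植物"}
STOP_VERBS: Set[str] = {"是"}  # "is" signals a taxonomic relation, so skip it


def extract_relations(tokens: List[Token],
                      domain_terms: Set[str],
                      stop_verbs: Set[str]) -> List[Tuple[str, str, str]]:
    """Match the pattern noun <-SBV- verb -VOB-> noun and filter the result."""
    triples = []
    for verb in tokens:
        # Pattern constraint: the relation word must be a verb not on the stop list.
        if verb.pos != "v" or verb.word in stop_verbs:
            continue
        subjects = [t for t in tokens
                    if t.head == verb.idx and t.dep == "SBV" and t.pos == "n"]
        objects = [t for t in tokens
                   if t.head == verb.idx and t.dep == "VOB" and t.pos == "n"]
        for s in subjects:
            for o in objects:
                # Word-list filter: keep only pairs of known domain concepts.
                if s.word in domain_terms and o.word in domain_terms:
                    triples.append((s.word, verb.word, o.word))
    return triples


def to_owl_axiom(subj: str, rel: str, obj: str) -> str:
    """Render a triple as an OWL 2 functional-syntax axiom string."""
    return f"SubClassOf(:{subj} ObjectSomeValuesFrom(:{rel} :{obj}))"


# Hand-written parse of "银杏 含有 黄酮" (ginkgo contains flavonoids).
sentence = [
    Token(1, "银杏", "n", 2, "SBV"),
    Token(2, "含有", "v", 0, "HED"),
    Token(3, "黄酮", "n", 2, "VOB"),
]
# Hand-written parse of "银杏 是 植物" (ginkgo is a plant), a taxonomic statement.
taxonomic = [
    Token(1, "银杏", "n", 2, "SBV"),
    Token(2, "是", "v", 0, "HED"),
    Token(3, "植物", "n", 2, "VOB"),
]

triples = extract_relations(sentence, DOMAIN_TERMS, STOP_VERBS)
axioms = [to_owl_axiom(*t) for t in triples]
```

In this sketch the first sentence yields one candidate non-taxonomic relation, while the second is rejected by the verb constraint. A real system would take its tokens from a dependency parser and would need far richer patterns than the single subject-verb-object shape used here.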

Keywords: plant domain ontology; knowledge graph; non-taxonomic relation; ontology learning; Baidu Encyclopedia

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]

 
