Learning Concept Hierarchies from Scientific Articles for Ontology Construction (领域学术本体概念等级关系抽取研究)

Cited by: 16


Authors: 蒋婷 (Jiang Ting)[1], 孙建军 (Sun Jianjun)[1]

Affiliation: [1] School of Information Management, Nanjing University, Nanjing 210093

Source: Journal of the China Society for Scientific and Technical Information (《情报学报》), 2017, No. 10, pp. 1080-1092 (13 pages)

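Funding: Major Bidding Project of the National Social Science Fund of China, "Research on Deep Aggregation and Services of Network Information Resources for Subject Domains" (12&ZD221)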

Abstract: Hierarchical relation extraction is a necessary stage in the automatic construction of domain ontologies. Existing research concentrates on the biomedical domain, and existing methods are not efficient. This paper proposes a method for extracting concept hierarchical relations from domain academic resources. First, in the concept extraction stage, concepts in scientific articles are divided into four term categories (methods, tasks, tools, and resources), and the terms of each category are extracted with a combination of cascaded conditional random fields, C-value, and rules (lexico-syntactic patterns), yielding an initial set of categorized terms. Second, constrained by these term categories, hierarchically related concept pairs (hyponym-hypernym pairs) are extracted by combining external lexicons with Web-based methods. Finally, a graph-based method builds a graph model from the concept pairs, and graph pruning is applied to generate the concept hierarchy. The method is evaluated on a corpus of domain scientific articles: concept extraction achieves high precision and recall across the term categories, hierarchically related concept pairs are extracted, and concept hierarchies are generated. The experiments confirm the feasibility and effectiveness of the proposed method.
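The abstract names C-value as one component of the term extraction step but does not say how it is configured. As a rough illustration only, the sketch below computes C-value scores in their commonly cited form: a candidate that is not nested in any longer candidate scores log2(length) times its frequency, and a nested candidate has the average frequency of the longer candidates containing it subtracted first. The function names `is_nested_in` and `c_value` and the sample frequencies are hypothetical and are not taken from the paper.

```python
import math

def is_nested_in(short, long):
    """Return True if the token tuple `short` occurs contiguously inside `long`."""
    n = len(short)
    return any(long[i:i + n] == short for i in range(len(long) - n + 1))

def c_value(freq):
    """Score candidate terms with the standard C-value formula.

    `freq` maps each candidate term (a tuple of tokens) to its corpus frequency.
    Single-word candidates score 0 because log2(1) == 0; C-value targets
    multi-word terms.
    """
    scores = {}
    for term, f in freq.items():
        # Longer candidates that contain this term as a contiguous substring.
        containers = [t for t in freq if len(t) > len(term) and is_nested_in(term, t)]
        weight = math.log2(len(term))
        if not containers:
            scores[term] = weight * f
        else:
            # Subtract the average frequency of the containing candidates.
            nested_avg = sum(freq[t] for t in containers) / len(containers)
            scores[term] = weight * (f - nested_avg)
    return scores

# Hypothetical candidate frequencies, invented for illustration.
example_freq = {
    ("random", "fields"): 12,
    ("conditional", "random", "fields"): 10,
    ("cascaded", "conditional", "random", "fields"): 4,
}
example_scores = c_value(example_freq)
```

In this sketch a candidate such as "random fields" is penalized because most of its occurrences fall inside longer candidates, which is the behavior that makes C-value suited to picking out complete multi-word terms.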

Keywords: ontology construction; hierarchical relation extraction; term extraction

Classification code: TP391.1 [Automation and Computer Technology: Computer Application Technology]

 
