Source: Journal of the China Society for Scientific and Technical Information (《情报学报》), 2017, No. 10, pp. 1080-1092 (13 pages)
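Funding: National Social Science Fund of China, Major Bidding Project "Research on Deep Aggregation and Services of Network Information Resources for Subject Domains" (12&ZD221)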
Abstract: Hierarchical (taxonomic) relation extraction is a necessary stage in automatic domain ontology construction. Existing research concentrates mainly on the biomedical field, and existing methods are not very efficient. This paper proposes a method for extracting concept hierarchies from domain academic resources. First, at the concept extraction stage, concepts in academic literature are divided into four term categories: methods, tasks, tools, and resources. Terms of each category are extracted with a combination of cascaded conditional random fields, C-value, and rules (lexico-syntactic patterns), yielding an initial set of categorized terms. Second, constrained by these term categories, concept pairs in a hierarchical (hyponym) relationship are extracted by combining an external lexicon with Web-based methods. Finally, a graph model is built from the concept pairs, and graph pruning is applied to generate the concept hierarchy. The method is validated on a corpus of domain academic literature: term extraction for the different categories achieves high precision and recall, hierarchical concept pairs are extracted, and the concept hierarchies are generated. The experiments support the feasibility and effectiveness of the proposed method.
Classification No.: TP391.1 [Automation and Computer Technology: Computer Application Technology]
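The abstract names C-value as one component of the term extraction step but does not say how it is configured. As a rough illustration only, the sketch below scores candidate terms with the standard C-value formula (log2 of term length times frequency, discounted by the mean frequency of longer candidates that contain the term). The function names, the toy frequencies, and the simple contiguous-subsequence nesting test are assumptions made for this example, not details taken from the paper.

```python
import math


def contains(longer, shorter):
    """Return True if `shorter` occurs as a contiguous word sequence in `longer`."""
    n, m = len(longer), len(shorter)
    return any(longer[i:i + m] == shorter for i in range(n - m + 1))


def c_value(freq):
    """Score candidate terms with the standard C-value formula.

    freq maps each candidate term (a tuple of words) to its corpus frequency.
    Single-word candidates score 0 because log2(1) == 0.
    """
    scores = {}
    for a, f_a in freq.items():
        # Frequencies of longer candidates that contain `a` (the terms nesting it).
        nesting = [f_b for b, f_b in freq.items()
                   if len(b) > len(a) and contains(b, a)]
        if nesting:
            # Discount by the average frequency of the nesting terms.
            f_a = f_a - sum(nesting) / len(nesting)
        scores[a] = math.log2(len(a)) * f_a
    return scores


# Hypothetical toy frequencies, invented for illustration.
freq = {
    ("cascaded", "conditional", "random", "field"): 2,
    ("conditional", "random", "field"): 4,
    ("random", "field"): 6,
}
scores = c_value(freq)
for term, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(" ".join(term), round(score, 3))
```

In this toy input the shorter candidate "random field" is the most frequent, yet it is discounted because it mostly appears inside longer candidates; that discount is the behaviour C-value is designed to provide.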