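Authors: WANG Zhong-yi (王忠义) [1]; SHEN Xue-ying (沈雪莹); HUANG Jing (黄京) [2]
Affiliations: [1] School of Information Management, Central China Normal University, Wuhan 430079, Hubei, China; [2] Wuhan Polytechnic, Wuhan 430072, Hubei, China
Source: Information Science (《情报科学》), 2021, No. 1, pp. 13-20 (8 pages)
Funding: Youth Fund for Humanities and Social Sciences Research, Ministry of Education, project "Multi-granularity knowledge organization of fragmented user-generated content in the big data environment" (19YJC870025).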
Abstract: 【Purpose/significance】This study aims to accurately extract method knowledge elements (KEs) from scientific literature, enabling finer-grained knowledge organization and retrieval. 【Method/process】It proposes a rule-based approach to extracting method KEs that works in two stages: (1) semi-automated identification of the initial description rules of method KEs, and (2) automated extraction and updating of method KEs together with their description rules. In the first stage, method KEs are extracted through combined manual and machine effort, guided by their descriptive characteristics, to obtain high-quality method KEs; their composition dimensions and initial description rules are then summarized. This stage provides the data foundation for the second stage and offers further insight into the composition dimensions of method KEs. The second stage treats the initial rules as clue words and uses regular expressions to extract method KEs from text; it then applies the PrefixSpan algorithm to the newly extracted method KEs to mine additional description rules, so that both the method KEs and their rules are updated dynamically. 【Result/conclusion】In a preliminary evaluation on 16 scientific papers, precision (P), recall (R), and F-measure were 0.71, 0.80, and 0.73 respectively (all above 0.5), indicating that the method is feasible and effective; the approach may also serve as a reference for finer-grained knowledge organization and retrieval. 【Innovation/limitation】A limitation is that extracting the description rules of method KEs still requires some manual involvement.
Keywords: scientific literature; method knowledge element; description rule; automatic extraction; PrefixSpan
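The second stage described in the abstract (clue-word matching with regular expressions, followed by PrefixSpan mining of new rules) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the clue patterns, the `tokenize` helper, the function names, and the support threshold are all hypothetical choices made for illustration, and the PrefixSpan routine is a simplified single-item variant of the algorithm.

```python
import re
from collections import defaultdict


def extract_candidates(sentences, clue_patterns):
    """Keep sentences that match at least one clue-word regex (hypothetical rules)."""
    compiled = [re.compile(p, re.IGNORECASE) for p in clue_patterns]
    return [s for s in sentences if any(c.search(s) for c in compiled)]


def tokenize(sentence):
    """Very rough tokenizer (an assumption): lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9\-]+", sentence.lower())


def prefixspan(sequences, min_support):
    """Simplified PrefixSpan over sequences of single items.

    Returns {pattern tuple: support}, where support is the number of
    sequences containing the pattern as an ordered subsequence.
    """
    results = {}

    def mine(prefix, projected):
        # projected holds (sequence index, start position) pairs: the
        # suffix of each sequence that follows the current prefix.
        first_pos = defaultdict(dict)  # item -> {sequence index: earliest position}
        for seq_idx, start in projected:
            seq = sequences[seq_idx]
            for pos in range(start, len(seq)):
                item = seq[pos]
                if seq_idx not in first_pos[item]:
                    first_pos[item][seq_idx] = pos
        for item, occurrences in first_pos.items():
            if len(occurrences) >= min_support:
                pattern = prefix + (item,)
                results[pattern] = len(occurrences)
                # Project each supporting sequence past the matched item.
                mine(pattern, [(i, p + 1) for i, p in occurrences.items()])

    mine((), [(i, 0) for i in range(len(sequences))])
    return results


if __name__ == "__main__":
    # Toy sentences, invented for illustration only.
    sentences = [
        "We propose a method based on rules.",
        "The weather was fine.",
        "We use a method based on SVM.",
    ]
    hits = extract_candidates(sentences, [r"\bbased on\b"])
    patterns = prefixspan([tokenize(s) for s in hits], min_support=2)
    for pattern, support in sorted(patterns.items(), key=lambda kv: -len(kv[0])):
        print(" ".join(pattern), support)
```

Frequent token patterns mined this way would be candidates for new description rules; in the workflow the abstract describes, such candidates would supplement the initial rules.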