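Research on a Method for Recognizing Topic Attribute Instances in Innovation Sentences of Domain Scientific and Technical Literature  Cited by: 9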

Research on Recognition of Concept Attribute Instances in Innovation Sentences of Scientific Research Paper


Authors: Zhang Fan [1,2], Le Xiaoqiu [1]

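Affiliations: [1] National Science Library, Chinese Academy of Sciences, Beijing 100190; [2] University of Chinese Academy of Sciences, Beijing 100049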

Source: New Technology of Library and Information Service (《现代图书情报技术》), 2015, No. 5, pp. 15-23 (9 pages)

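Funding: One of the research outputs of the sub-project "Research and Demonstration of Domain Academic Relationships Based on Literature Knowledge Networks" (Project No. 2011BAH10B06-04), part of a key project of the National Science and Technology Support Program during the 12th Five-Year Plan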

Abstract: [Objective] To recognize topic (concept) attribute instances in innovation sentences and thereby further mine the knowledge relations those sentences express. [Methods] Using semantic role labeling and dependency parsing, together with the subject terms listed under the attribute classes of a domain ontology, the method recognizes the core concept and its attribute instances in the dependency tree of each innovation sentence. Two modules tailored to the characteristics of dependency parsing, one for recognizing combined (multi-word) terms and one for recognizing conjunction relations, are added to improve recognition. [Results] The F-value for core concept recognition reaches 77.94%, and the average F-value for attribute instance recognition is around 90%. [Limitations] Parsing errors made by the Stanford dependency parser on tumor-domain text reduce recognition accuracy, and the attribute classes of the NCIt ontology still need further filtering and standardization. [Conclusions] The experimental results indicate that the method recognizes topic attribute instances in domain innovation sentences well.
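The general idea in the abstract, locating attribute terms in a dependency tree and linking them to the concept they describe, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The sentence, its hand-written parse (labels in the Stanford / Universal Dependencies style), the `ATTRIBUTE_TERMS` lexicon, and all function names are assumptions made for illustration; the record does not describe the actual rules, and no real parser or NCIt data is used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    idx: int      # 1-based position in the sentence
    text: str
    head: int     # idx of the governing token; 0 marks the root
    deprel: str   # dependency label


# Hypothetical attribute lexicon; a stand-in for terms under an ontology's attribute classes.
ATTRIBUTE_TERMS = {"size", "grade", "stage"}

# Hand-written parse of "The size and grade of the breast tumor were measured".
SENTENCE = [
    Token(1, "The", 2, "det"),
    Token(2, "size", 10, "nsubjpass"),
    Token(3, "and", 2, "cc"),
    Token(4, "grade", 2, "conj"),
    Token(5, "of", 8, "case"),
    Token(6, "the", 8, "det"),
    Token(7, "breast", 8, "compound"),
    Token(8, "tumor", 2, "nmod"),
    Token(9, "were", 10, "auxpass"),
    Token(10, "measured", 0, "root"),
]


def children(tokens, idx):
    """Return the tokens directly governed by the token at position idx."""
    return [t for t in tokens if t.head == idx]


def merge_compound(tokens, head):
    """Combined-term step: join a head noun with its compound/amod modifiers."""
    mods = [t for t in children(tokens, head.idx) if t.deprel in {"compound", "amod"}]
    return " ".join(t.text for t in sorted(mods + [head], key=lambda t: t.idx))


def find_attribute_instances(tokens):
    """Pair each attribute term with the concept attached to it by an nmod relation."""
    by_idx = {t.idx: t for t in tokens}
    pairs = []
    for tok in tokens:
        if tok.text.lower() not in ATTRIBUTE_TERMS:
            continue
        # Conjunction step: a conjoined attribute shares the first conjunct's dependents.
        anchor = by_idx[tok.head] if tok.deprel == "conj" else tok
        for dep in children(tokens, anchor.idx):
            if dep.deprel == "nmod":
                pairs.append((merge_compound(tokens, dep), tok.text))
    return pairs


print(find_attribute_instances(SENTENCE))
# [('breast tumor', 'size'), ('breast tumor', 'grade')]
```

In this toy parse, "grade" has no `nmod` dependent of its own, so without the conjunction step only "size" would be linked to "breast tumor"; without the compound step the concept would be reduced to "tumor". These two cases give a sense of why modules for combined terms and conjunction relations might improve recognition.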

Keywords: Domain ontology; Semantic role labeling; Dependency parsing; Attribute instance

Classification code: TP391.1 [Automation and Computer Technology: Computer Application Technology]

 
