Design and Construction of Knowledge Base for Indexing Chinese Newspaper Literatures
(Original title: 中文报纸文献标引知识库设计与构建)

Cited by: 1

Author: 薛春香 (Xue Chunxiang) [1]

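Affiliation: [1] Department of Information Management, Nanjing University of Science and Technology, Nanjing, Jiangsu 210094, China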

Source: 《情报科学》 (Information Science), 2013, No. 7, pp. 121-125 (5 pages)

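Funding: Youth Project of the Humanities and Social Sciences Research Fund, Ministry of Education (09YJC870014); Youth Project of the Jiangsu Provincial Social Science Fund (09TQC011)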

Abstract: Subject indexing, classification indexing, and named entity extraction are the main forms of deep content processing for newspaper literature, and automatic indexing based on a knowledge base is one way to automate that work. Building on a survey of current research on automatic indexing of newspaper literature, the paper distills a general workflow for the task and argues that constructing a knowledge base is the prerequisite for automatic indexing. Given the characteristics of newspaper indexing, it proposes that the knowledge base consist of three parts: a subject indexing base, a classification knowledge base, and an entity indexing base, each made up of multiple vocabularies and word lists. The resulting knowledge base integrates multiple vocabularies, is large in scale, is extensible, and is simple to implement. Finally, the paper discusses key construction techniques: the controlled subject vocabulary, the concordance table between class numbers and subject terms, and the rule base for named entity extraction.

Keywords: newspaper literature; automatic indexing; classification indexing; knowledge base

Classification No.: G254 [Cultural Sciences: Library Science]

 
