SDIC/CASDAIS: An Automatic Indexing System for Chinese Documents of Agricultural Science and Technology

Cited by: 3


Authors: Wang Jihua (王继华)[1], Wang Huaihui (王怀惠)[1], Wu Zeyi (吴泽宜)[1]

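Affiliation: [1] Scientific and Technical Literature Information Center, Chinese Academy of Agricultural Sciences (中国农业科学院科技文献信息中心)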

Source: Journal of the China Society for Scientific and Technical Information (《情报学报》), 1995, No. 5, pp. 329-334 (6 pages)

Abstract: This paper describes SDIC/CASDAIS, an automatic indexing system for Chinese agricultural literature that integrates automatic subject indexing with automatic classification indexing. It follows a dictionary-based approach that combines a subject thesaurus, a pre-match word list, and a stop-word list. Matching uses a forward longest-match algorithm with character addition and character skipping, backtracks over the last two characters, and applies a large set of rules to reduce indexing errors. The system handles the abbreviations and non-standard scientific and technical terms that are common in agricultural literature, and it supports dynamic word formation. For keywords that are absent from the dictionaries, such as variety names, substance names, and place names, SDIC/CASDAIS uses a characteristic-word extraction method. Its free-word judgment rules can also identify some of the free words in a title, and frequency statistics on these words can serve as a basis for updating the dictionaries. SDIC/CASDAIS indexes 3,000 titles per hour, with an average indexing depth slightly above 4, a subject indexing precision of about 98%, and a basic agreement rate of 80% for classification indexing.
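The dictionary-based matching named in the abstract can be illustrated with a minimal sketch of plain forward longest matching against a subject-term list and a stop-word list. This is not the authors' implementation: the function name `forward_max_match`, the sample title, and the tiny word lists are all hypothetical, and the sketch leaves out the pre-match word list, the character-addition and character-skipping steps, the two-character backtracking, and the error-reduction rules that the abstract mentions.

```python
def forward_max_match(title, subject_terms, stop_words):
    """Scan a title left to right and return the subject terms found.

    At each position the longest dictionary entry is tried first.
    Stop words are consumed but not indexed; characters that match
    nothing are skipped one at a time.
    """
    subject_terms = set(subject_terms)
    vocab = subject_terms | set(stop_words)
    max_len = max((len(word) for word in vocab), default=1)

    hits = []
    i = 0
    while i < len(title):
        matched = None
        # Try the longest candidate first, then shrink by one character.
        for size in range(min(max_len, len(title) - i), 0, -1):
            candidate = title[i:i + size]
            if candidate in vocab:
                matched = candidate
                break
        if matched is None:
            i += 1  # unknown character: move on
            continue
        if matched in subject_terms:
            hits.append(matched)
        i += len(matched)
    return hits


# Hypothetical word lists and title, for illustration only.
terms = {"小麦", "白粉病", "小麦白粉病", "防治"}
stops = {"的", "研究"}
print(forward_max_match("小麦白粉病的防治研究", terms, stops))
# → ['小麦白粉病', '防治']
```

Because the longest entry wins, the title yields the compound term 小麦白粉病 rather than the two shorter terms 小麦 and 白粉病. Whether the real system prefers compound or component terms in such cases is not stated in the abstract.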

Keywords: agricultural literature; automatic indexing system; document indexing; SDIC; CASDAIS

Classification code: G254-39 [Cultural Sciences: Library Science]

 
