Ambiguity resolution-based subject indexing system and its implementation  Cited by: 1


Authors: XIA Guang-hui[1], LI Xiao-ying[1], LI Yang[1], CUI Xin-die, JI Yu-jing[1], LI Jun-lian[1] (Institute of Medical Information, Chinese Academy of Medical Sciences, Beijing 100020, China)

Affiliation: [1] Institute of Medical Information, Chinese Academy of Medical Sciences, Beijing 100005

Source: Chinese Journal of Medical Library and Information Science, 2021, No. 3, pp. 58-65 (8 pages)

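Funding: National Science and Technology Library (NSTL) special task "Optimization Research and Application of the NSTL Subject Indexing System" (2020XM49); NSTL preliminary R&D task "Research on Semantic Annotation of Text Knowledge Objects" (XQYF0201).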

Abstract: Objective: To improve the accuracy of automatic subject indexing of literature by using an ambiguity resolution mechanism to filter out ambiguous concepts. Methods: Based on the STKOS (Science and Technology Knowledge Organization System) Super Vocabulary, a mapping table between the National Science and Technology Library (NSTL) journal classification and the STKOS categories was constructed. Ambiguous concepts were filtered according to the principle of domain consistency between a concept and the document, and the subject indexing system for foreign-language scientific and technical literature was optimized by combining this filter with annotation dictionary generation, term lemmatization, general concept filtering, and concept selection. Results: On a scientific journal dataset, subject indexing achieved a precision of 77.53%, a recall of 73.25%, and an F-measure of 75.33%. Conclusion: The ambiguity resolution mechanism can effectively improve subject indexing of foreign-language scientific and technical literature.
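The domain-consistency filter described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record gives no algorithmic detail, so the `Concept` type, the `JOURNAL_TO_STKOS` table, the category names, the concept IDs, and the fallback for unmapped journals are all hypothetical and chosen only for illustration. The idea shown is that a candidate concept is kept only if at least one of its categories matches the categories mapped from the journal's classification.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Concept:
    """A candidate concept matched in a document (hypothetical structure)."""
    concept_id: str
    term: str
    categories: FrozenSet[str]  # hypothetical STKOS-style category labels


# Hypothetical stand-in for a journal classification -> category mapping table.
JOURNAL_TO_STKOS: Dict[str, FrozenSet[str]] = {
    "medical_journal": frozenset({"medicine", "biology"}),
    "energy_journal": frozenset({"electrical_engineering"}),
}

# The surface term "cell" is ambiguous between two candidate concepts.
CANDIDATES: List[Concept] = [
    Concept("C001", "cell", frozenset({"biology"})),
    Concept("C002", "cell", frozenset({"electrical_engineering"})),
]


def filter_by_domain(
    candidates: List[Concept],
    journal_class: str,
    mapping: Dict[str, FrozenSet[str]] = JOURNAL_TO_STKOS,
) -> List[Concept]:
    """Keep candidates whose categories overlap the journal's mapped categories.

    Assumption: if the journal class has no mapping entry, nothing is filtered.
    """
    allowed = mapping.get(journal_class)
    if allowed is None:
        return list(candidates)
    return [c for c in candidates if c.categories & allowed]


if __name__ == "__main__":
    for concept in filter_by_domain(CANDIDATES, "medical_journal"):
        print(concept.concept_id, concept.term, sorted(concept.categories))
```

In this toy setup, an article from a journal classed as `medical_journal` keeps only the biological sense of "cell". A real system would also need to handle concepts with no category and journals with several classifications, which the sketch leaves out.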

Keywords: ambiguity resolution; literature indexing; subject indexing system; category classification

Classification code: G254.21 [Cultural Sciences - Library Science]

 
