Application Research on Constructing a Vector Space Model of Classification based on Thesaurus for the Judgment of Relevance of Chinese Literatures  Cited by: 2

Original title: 基于词表的分类向量空间模型在中文科技文献相关性判定中的应用研究


Authors: Li Qin (李秦), Yang Wenjian (杨文建) [2], Tan Lin (谭琳)

Affiliations: [1] 重庆建筑工程职业学院 (Chongqing) [2] Chongqing University of Education (重庆第二师范学院) [3] Chongqing VIP Information Co., Ltd. (重庆维普资讯有限公司)

Source: Library Journal (《图书馆杂志》), 2016, No. 12, pp. 32-40 (9 pages)

Abstract: This paper examines the characteristics of three mechanisms for identifying related literature, with the aim of building a more effective relevance database for Chinese scientific and technical literature. Drawing on the complete content feature algorithm, it preprocesses related literature with a thesaurus-based classification vector space model (VSM) and, taking the metallurgical industry as an example, constructs a relevance database of Chinese scientific and technical literature. The model's relevance judgments are evaluated in two ways: by comparing the system's judgments with manual judgments, and by comparing them with the judgments of two other systems. The results indicate that the thesaurus-based classification VSM achieves high accuracy. Judging related literature with the complete content feature algorithm helps improve the functions of knowledge discovery systems and raise the level of knowledge services.
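The abstract does not give the model's formulas, so the following is only a minimal sketch of the general idea behind a thesaurus-based vector space model, not the authors' implementation. It assumes forward maximum matching against a term list for segmentation, raw term-frequency weights, and cosine similarity as the relevance score. The mini-thesaurus and the names `segment`, `cosine`, and `relevance` are hypothetical, and the classification step of the paper's model is not reproduced here.

```python
import math
from collections import Counter

# Hypothetical mini-thesaurus of metallurgy terms (assumption, for illustration only).
THESAURUS = {"高炉", "炼铁", "炼钢", "转炉", "铁矿石"}
MAX_TERM_LEN = max(len(term) for term in THESAURUS)


def segment(text, thesaurus=THESAURUS, max_len=MAX_TERM_LEN):
    """Forward maximum matching: keep only substrings found in the thesaurus."""
    terms = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then progressively shorter ones.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in thesaurus:
                terms.append(candidate)
                i += size
                break
        else:
            i += 1  # no thesaurus term starts here; skip one character
    return terms


def cosine(vec_a, vec_b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(v * v for v in vec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a document with no thesaurus terms matches nothing
    return dot / (norm_a * norm_b)


def relevance(doc_a, doc_b):
    """Score two documents by the cosine of their thesaurus-term vectors."""
    return cosine(Counter(segment(doc_a)), Counter(segment(doc_b)))
```

Under these assumptions, two documents that share no thesaurus terms score 0, and documents with identical term distributions score approximately 1. A production system would likely weight terms (for example with TF-IDF) and apply a relevance threshold, neither of which the abstract specifies.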

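Keywords: classification vector space model; classification-SIM (分类-SIM); thesaurus-based word segmentation; relevance judgment; related literature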

Classification number: G353.1 [Cultural Sciences: Information Science]

 
