Authors: Wang Dafu, Deng Zhiwen[1], Jia Zhiyong[1], Wang Jing[1] (Library, China University of Mining and Technology, Xuzhou 221116, China)
Source: Journal of Henan Normal University (Natural Science Edition), 2023, No. 4, pp. 34-42 (9 pages)
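Funding: Jiangsu Provincial University Philosophy and Social Science Research Project (2022SJYB1129); National Social Science Fund of China (22BTQ023).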
Abstract: To relieve the "information overload" that massive electronic resources impose on readers, a content-based recommendation algorithm is used to recommend academic papers that match readers' interests and are of high quality. The approach takes into account the contextual semantics, word order, and global topic information of the paper text. First, a hybrid semantic model combining Doc2Vec and LDA (Latent Dirichlet Allocation) is trained on a corpus of abstracts from the candidate paper set to learn a text vector for each paper. Second, the candidate papers are clustered with the K-Means algorithm. Third, the members of the cluster to which the target paper belongs are taken as the papers to be recommended. Finally, similarity is computed with a literature quality weight fused in, and the results are sorted to obtain the TOP-N nearest-neighbor recommendations. Using CNKI journal papers in library and information science as the corpus, an empirical analysis shows that, compared with three traditional models, TF-IDF (Term Frequency-Inverse Document Frequency), Word2Vec, and LDA, the hybrid model yields higher recommendation precision and lower ranking difference, and achieves good recommendation results.
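The later stages of the pipeline described in the abstract (clustering the paper vectors, restricting candidates to the target paper's cluster, and ranking by similarity fused with a quality weight) can be illustrated with a minimal pure-Python sketch. Several parts are assumptions rather than the authors' method: the hybrid vector is formed here by simple concatenation of a Doc2Vec-style vector and an LDA-style topic distribution, the fusion is a linear blend controlled by a hypothetical parameter `alpha`, the toy vectors and quality scores are made up, and the small `kmeans` helper stands in for a library implementation.

```python
import math
import random

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def hybrid_vector(doc_vec, topic_vec):
    """Assumption: combine the two representations by concatenation."""
    return list(doc_vec) + list(topic_vec)

def kmeans(vectors, k, iters=20, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per vector."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assign each vector to its nearest centroid (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])))
            for v in vectors
        ]
        # Move each centroid to the mean of its members (empty clusters stay put).
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def recommend(target, vectors, labels, quality, top_n=3, alpha=0.8):
    """Rank the target's cluster mates by a blended score.

    Assumption: score = alpha * cosine similarity + (1 - alpha) * quality,
    with quality scores already normalized to [0, 1].
    """
    scored = []
    for i, v in enumerate(vectors):
        if i != target and labels[i] == labels[target]:
            score = alpha * cosine(vectors[target], v) + (1 - alpha) * quality[i]
            scored.append((score, i))
    scored.sort(reverse=True)
    return [i for _, i in scored[:top_n]]

# Toy data: six papers, each with a 2-d "Doc2Vec" vector and a 2-d "LDA" topic vector.
doc_vecs = [[1.0, 0.1], [0.9, 0.2], [0.8, 0.0], [0.1, 1.0], [0.2, 0.9], [0.0, 0.8]]
topic_vecs = [[0.9, 0.1], [0.8, 0.2], [0.9, 0.1], [0.1, 0.9], [0.2, 0.8], [0.1, 0.9]]
quality = [0.5, 0.2, 0.9, 0.7, 0.4, 0.6]  # hypothetical normalized quality weights

vectors = [hybrid_vector(d, t) for d, t in zip(doc_vecs, topic_vecs)]
labels = kmeans(vectors, k=2)
print(recommend(0, vectors, labels, quality, top_n=2))
```

In this toy data, papers 1 and 2 are almost equally similar to paper 0, so the higher quality weight of paper 2 moves it ahead of paper 1; setting `alpha=1.0` ranks by similarity alone. In practice the vectors would come from trained Doc2Vec and LDA models rather than hand-written lists.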