Subject Term Recommendation Based on the Fusion of Explicit and Implicit Information and One-class Collaborative Filtering  Cited by: 3


Authors: Li Shuqing[1]; Huang Jinwang; Ma Dandan; Zhang Zhiwang

Affiliation: [1] School of Information Engineering, Nanjing University of Finance and Economics, Nanjing 210023

Source: Library and Information Service (《图书情报工作》), 2023, No. 3, pp. 72-84 (13 pages)

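Funding: One of the research outputs of the National Social Science Fund of China project "Research on the Efficiency of Knowledge Exchange in Academic Virtual Communities" (Project No. 17BTQ028).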

Abstract: [Purpose/Significance] This paper proposes a subject term recommendation method for literature, based on a one-class collaborative filtering algorithm that fuses explicit and implicit information, with the aim of improving the accuracy of subject term recommendation for scholars and documents. [Method/Process] A matrix factorization model based on document richness and subject term popularity is constructed to estimate the relevance probability between a document and each subject term that does not appear in it. According to this probability, those subject terms are divided into the document's implicitly relevant subject terms and implicitly irrelevant subject terms. Two different weight prediction methods are then proposed for the two groups, respectively: an autoencoder (AutoRec) filling model that incorporates a preference coefficient, and a zero filling model. [Result/Conclusion] Experiments on SD4AI, a scientific and technical literature dataset for the field of artificial intelligence, show that, compared with a range of typical collaborative filtering methods, the proposed method improves both the prediction of subject term weights and the identification of high-weight subject terms: MAE and FCP improve by up to 16.07% and 16.83%, and the recommendation performance on P@N and NDCG@N reaches up to 22.37% and 27.06%.
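The partition-then-fill idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the relevance probabilities are assumed to come from some upstream model, the threshold of 0.5 is an arbitrary assumption, and `predict_weight` is a hypothetical placeholder standing in for the paper's autoencoder filling model. The helper names (`split_unobserved_terms`, `fill_weights`) are invented for illustration. MAE, P@N, and NDCG@N follow their standard definitions.

```python
import math


def split_unobserved_terms(observed, relevance_prob, threshold=0.5):
    """Partition terms absent from a document by relevance probability.

    observed: dict term -> weight for terms that appear in the document.
    relevance_prob: dict term -> probability for every candidate term
        (assumed to be produced by an upstream model).
    threshold: assumed cut-off; the abstract does not state one.
    """
    relevant, irrelevant = [], []
    for term, prob in relevance_prob.items():
        if term in observed:
            continue  # explicit terms keep their observed weight
        (relevant if prob >= threshold else irrelevant).append(term)
    return sorted(relevant), sorted(irrelevant)


def fill_weights(observed, relevant, irrelevant, predict_weight):
    """Zero-fill implicitly irrelevant terms; predict the relevant ones."""
    filled = dict(observed)
    for term in irrelevant:
        filled[term] = 0.0
    for term in relevant:
        # Hypothetical stand-in for the learned weight predictor.
        filled[term] = predict_weight(term)
    return filled


def mae(predicted, actual):
    """Mean absolute error over the terms present in `actual`."""
    return sum(abs(predicted[t] - actual[t]) for t in actual) / len(actual)


def precision_at_n(ranked_terms, relevant_terms, n):
    """Fraction of the top-n ranked terms that are relevant."""
    return sum(1 for t in ranked_terms[:n] if t in relevant_terms) / n


def ndcg_at_n(ranked_terms, gains, n):
    """NDCG@N with graded gains and a log2 position discount."""
    dcg = sum(gains.get(t, 0.0) / math.log2(i + 2)
              for i, t in enumerate(ranked_terms[:n]))
    ideal = sorted(gains.values(), reverse=True)[:n]
    idcg = sum(g / math.log2(i + 2) for i, g in enumerate(ideal))
    return dcg / idcg if idcg > 0 else 0.0
```

For example, with `observed = {"neural network": 0.8}` and probabilities of 0.7 for "deep learning" and 0.1 for "library science", the first function places "deep learning" in the implicitly relevant group and "library science" in the implicitly irrelevant group, which is then zero-filled.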

Keywords: subject term recommendation; extended subject terms; one-class collaborative filtering; term relevance; term weight

Classification No.: G250 [Cultural Science: Library Science]

 
