Explicit Rating Filling Strategy Based on Selection Data Bias Elimination and Conditional Generative Adversarial Networks (cited by: 4)


Authors: Shi Lei; Li Shuqing[1]; Jiang Mingfeng; Zhang Zhiwang; Wang Yu (College of Information Engineering, Nanjing University of Finance & Economics, Nanjing 210023, China)

Affiliation: [1] College of Information Engineering, Nanjing University of Finance & Economics, Nanjing 210023

Source: Data Analysis and Knowledge Discovery, 2023, No. 6, pp. 1-14 (14 pages)

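Funding: This work is one of the research outcomes of the Major Project of Natural Science Research of Jiangsu Higher Education Institutions (Grant No. 19KJA510011) and the National Natural Science Foundation of China (Grant No. 61877061).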

Abstract: [Objective] To alleviate the data sparsity and user selection bias that are widespread in explicit rating data in recommender systems, this study proposes a rating data filling model based on uninteresting item injection. [Methods] A general rating data filling model is built on the Conditional Generative Adversarial Network framework. A denoising autoencoder serves as the generator to capture the nonlinear latent factors behind interactions and to improve the robustness of the model. To address selection bias, uninteresting items are mined based on each user's point-in-time visibility and injected into the model by modifying the mask mechanism, so that the generated data match the user's real rating distribution. [Results] Experiments on the MovieLens and Amazon CD datasets show that, after data filling, the recommendation accuracy of ItemCF, BiasSVD, and AutoRec improves by more than three times on average. [Limitations] Data generation relies on rating data, so the method cannot be applied effectively in cold-start scenarios where rating data are extremely sparse. [Conclusions] The proposed model effectively alleviates data sparsity and eliminates selection bias, significantly improving the performance of existing collaborative filtering methods on recommendation tasks.

Keywords: data sparsity; selection bias; generative adversarial networks; uninteresting items; data filling

Classification number: TP393 [Automation and Computer Technology: Computer Application Technology]
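The abstract states that uninteresting items are injected by modifying the mask mechanism but gives no formulas. The following is only a minimal sketch of what such a mask change could look like, not the authors' implementation. It assumes a rating matrix in which 0 means "unrated", a precomputed boolean matrix of uninteresting items (how these are mined from point-in-time visibility is not specified in the abstract), and a hypothetical low target value `low_rating` for injected items. All function and parameter names are illustrative.

```python
import numpy as np

def build_masks(ratings, uninteresting):
    """Return (observed, mask) boolean matrices.

    ratings: (users, items) array, 0 means the item is unrated.
    uninteresting: boolean array of the same shape (assumed to be given).
    """
    observed = ratings > 0
    # An item the user actually rated is never treated as uninteresting.
    injected = uninteresting & ~observed
    # Modified mask: observed ratings plus injected uninteresting items.
    return observed, observed | injected

def masked_targets(ratings, observed, mask, low_rating=1.0):
    """Targets under the modified mask (low_rating is an assumption)."""
    injected = mask & ~observed
    return np.where(injected, low_rating, ratings) * mask

def masked_output(generated, mask):
    """Keep only the generator outputs that the mask exposes."""
    return generated * mask

ratings = np.array([[5.0, 0.0, 3.0, 0.0],
                    [0.0, 4.0, 0.0, 0.0]])
uninteresting = np.array([[False, True, True, False],
                          [True, False, False, False]])

observed, mask = build_masks(ratings, uninteresting)
targets = masked_targets(ratings, observed, mask)
generated = np.full(ratings.shape, 2.5)  # stand-in for generator output
visible = masked_output(generated, mask)
```

In this sketch the unmodified mask would expose only the rated entries, so the model would see only items the user chose to rate. Adding the injected entries gives it low-rating examples for items the user is assumed to have seen and skipped.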

 
