Authors: Shi Lei; Li Shuqing[1]; Jiang Mingfeng; Zhang Zhiwang; Wang Yu (College of Information Engineering, Nanjing University of Finance & Economics, Nanjing 210023, China)
Source: Data Analysis and Knowledge Discovery (《数据分析与知识发现》), 2023, No. 6, pp. 1-14 (14 pages)
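Funding: This work is one of the research outputs of the Major Natural Science Research Project of Higher Education Institutions of Jiangsu Province (Grant No. 19KJA510011) and the National Natural Science Foundation of China (Grant No. 61877061).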
Abstract: [Objective] To alleviate the data sparsity and user selection bias that are widespread in explicit rating data in recommender systems, this study proposes a rating data filling model based on uninteresting item injection. [Methods] A general rating data filling model is built on the Conditional Generative Adversarial Network framework. A denoising autoencoder serves as the generator to capture the nonlinear latent factors behind user-item interactions and to improve the robustness of the model. To address selection bias, uninteresting items are mined based on the user's time-point visibility and injected into the model by modifying the mask mechanism, so that the generated data follow the user's real rating distribution. [Results] Experiments on the MovieLens and Amazon CD datasets show that, after data filling, the recommendation accuracy of the ItemCF, BiasSVD, and AutoRec algorithms improves by more than threefold on average. [Limitations] Data generation relies on rating data, so the model cannot be applied effectively in cold-start scenarios where rating data are extremely sparse. [Conclusions] The proposed model effectively alleviates data sparsity and eliminates selection bias, significantly improving the performance of existing collaborative filtering methods on recommendation tasks.
Keywords: data sparsity; selection bias; generative adversarial network; uninteresting items; data filling
Classification No.: TP393 [Automation and Computer Technology: Computer Application Technology]