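Affiliations: [1] Department of Computer Science and Technology, Tsinghua University, Beijing 100084 [2] Tsinghua University Library, Beijing 100084 [3] Research Institute of Information Technology, Tsinghua University, Beijing 100084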
Source: Journal of Tsinghua University (Science and Technology), 2009, No. 4, pp. 590-594 (5 pages)
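Funding: National Natural Science Foundation of China (60473078); National "863" High-Tech Program (2006AA010101); National "Eleventh Five-Year Plan" Science and Technology Support Program (2006BAH02A12)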
Abstract: Most collaborative filtering (CF) research evaluates algorithms on a single dataset or on datasets with the same characteristics. This paper analyzes five typical CF algorithms: the User-based KNN method (with 20 neighbors), the Item-based method, the Item average method, the Item user average method, and the Slope One method. The algorithms are evaluated on two datasets, MovieLens and Book-Crossing, which have different user-item distribution characteristics. On the relatively dense ratings of MovieLens, Slope One has the best prediction accuracy; on the relatively sparse ratings of Book-Crossing, the Item-based method is the best and Slope One is the worst. Because the algorithms rank differently on the two datasets, a CF algorithm should be chosen according to the actual distribution of users and items.
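Classification: TP391.4 [Automation and Computer Technology: Computer Application Technology] TP311.13 [Automation and Computer Technology: Computer Science and Technology]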
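Illustration: Slope One is one of the five algorithms named in the abstract. The record does not say which variant the paper evaluates, so the sketch below assumes the commonly described weighted form, in which each average rating difference is weighted by the number of users who rated both items. The function name `slope_one_predict` and the nested-dict rating layout are hypothetical choices made for this example and are not taken from the paper.

```python
from collections import defaultdict


def slope_one_predict(ratings, user, target):
    """Predict `user`'s rating of `target` with weighted Slope One (assumed variant).

    `ratings` maps each user to a dict of {item: rating}.
    Returns None when no co-rated item supports a prediction.
    """
    diff_sum = defaultdict(float)  # sum of (target rating - other rating) per item
    count = defaultdict(int)       # number of users who rated both items

    # Collect rating differences from every user who rated the target item.
    for items in ratings.values():
        if target not in items:
            continue
        for other, value in items.items():
            if other != target:
                diff_sum[other] += items[target] - value
                count[other] += 1

    # Combine the user's own ratings with the average differences,
    # weighting each term by how many users support it.
    numerator = 0.0
    denominator = 0
    for other, value in ratings[user].items():
        if other != target and count[other] > 0:
            avg_diff = diff_sum[other] / count[other]
            numerator += (value + avg_diff) * count[other]
            denominator += count[other]

    return numerator / denominator if denominator else None
```

With sparse ratings, few users rate any given pair of items, so the averaged differences rest on very small counts. This is one plausible reading of why the abstract reports Slope One performing worst on Book-Crossing, though the record itself does not give the explanation.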