Authors: WANG Mo (王末); ZHENG Xiaohuan (郑晓欢) [3]; WANG Juanle (王卷乐) [4,5,6]; BAI Yongqing (柏永青)
Affiliations: [1] Agricultural Information Institute of Chinese Academy of Agricultural Sciences, Beijing 100081, China; [2] Key Laboratory of Agricultural Big Data, Ministry of Agriculture, Beijing 100081, China; [3] Office of General Affairs, Chinese Academy of Sciences, Beijing 100864, China; [4] State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, CAS, Beijing 100101, China; [5] University of Chinese Academy of Sciences, Beijing 100049, China; [6] Jiangsu Center for Collaborative Innovation in Geographical Information Resource Development and Application, Nanjing 210023, China
Source: Geographical Research (《地理研究》), 2018, No. 4, pp. 814-824 (11 pages)
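Funding: National Science and Technology Infrastructure Platform Construction Project (2005DKA32300); Chinese Academy of Sciences Characteristic Institute Cultivation and Construction Service Project (TSYJS03); China Knowledge Centre for Engineering Sciences and Technology Construction Project (CKCEST-2017-3-1); Agricultural Scientific Data Mining and Analysis Platform Research and Construction Project (JBYW-AII-2017-32); Chinese Academy of Agricultural Sciences Science and Technology Innovation Program (CAAS-ASTIP-2016-AII)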
Abstract: Recommender systems are effective tools for helping Internet users cope with information overload. In the geoscience data sharing domain, items (datasets) carry richer spatial and temporal attributes than regular items (e.g., books, movies, music), which makes high-performance recommendation for geoscience data more challenging. This study proposes a hybrid approach that combines content-based filtering (CBF) with item-based collaborative filtering (CF) using dynamic weights. The approach draws on the predictive ability of CF and uses item content information to mitigate the data sparsity and early rater problems. Users' ratings on datasets were first derived from access logs by classifying their historical visiting time with the Jenks Natural Breaks algorithm. In the CBF part, the spatial, temporal, and thematic information of geoscience datasets was extracted to compute item similarity, and user interest was computed from each user's historical behavior. Predicted ratings for unvisited datasets were computed with the k-NN method separately for CF and CBF and then combined as a weighted sum. Using the training dataset, the relationship between the ideal weights and users' co-rating level was modeled; a logarithmic function gave the best fit. The model was then applied to tune the weights of CF and CBF on a per user-item basis with the test dataset. Evaluation results showed that the dynamically weighted approach, which incorporates the spatial and temporal attributes of the data, significantly outperformed CF alone and CBF alone in terms of precision and recall.
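The weighting step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract states only that a logarithmic function relates the ideal weight to the co-rating level; the exact form `a * ln(level + 1) + b`, the coefficients `a` and `b`, the clamping to [0, 1], and all function names here are assumptions chosen for illustration. The k-NN predictor is likewise a generic similarity-weighted average, with the similarity function supplied by the caller.

```python
import math


def knn_predict(target, rated, similarity, k=3):
    """Similarity-weighted mean of the user's ratings on the k items
    most similar to `target`. `rated` maps item -> rating."""
    neighbours = sorted(rated, key=lambda item: similarity(target, item),
                        reverse=True)[:k]
    num = sum(similarity(target, item) * rated[item] for item in neighbours)
    den = sum(abs(similarity(target, item)) for item in neighbours)
    return num / den if den else 0.0


def cf_weight(co_rating_level, a=0.2, b=0.1):
    """Hypothetical logarithmic weight model (a and b are made-up
    coefficients, not values from the paper), clamped to [0, 1]."""
    w = a * math.log(co_rating_level + 1) + b
    return min(1.0, max(0.0, w))


def hybrid_score(cf_score, cbf_score, co_rating_level):
    """Weighted sum of the CF and CBF predictions for one user-item pair."""
    w = cf_weight(co_rating_level)
    return w * cf_score + (1 - w) * cbf_score
```

Under these assumed coefficients, a user-item pair with a low co-rating level leans mostly on the content-based prediction, and the collaborative prediction gains weight as the co-rating level grows.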
Classification No.: TP391.3 [Automation and Computer Technology: Computer Application Technology]