对数似然相似度算法的MapReduce并行化实现 (cited by: 3)

Parallel implementing loglikelihood similarity algorithm based on MapReduce programming model

Source: Computer Engineering and Design (《计算机工程与设计》), 2015, No. 5, pp. 1233-1238 (6 pages)

Funding: Key Program of the National Natural Science Foundation of China (612724420); Jiangsu Province 973 Fund Project (BK2011022)

Abstract: To improve the ability of the collaborative filtering (CF) algorithms in Mahout to handle massive data, the cloud computing platform is studied and a method for computing similarity with the MapReduce programming model is proposed. Four MapReduce jobs are designed to parallelize the log-likelihood similarity algorithm. Taking the characteristics of the algorithm into account, composite keys and a co-occurrence matrix are used to merge many small key-value pairs into larger ones, which reduces intermediate computation and communication overhead. The experimental results show that, compared with the single-machine similarity algorithm in Mahout, the Hadoop-based log-likelihood similarity algorithm achieves excellent linear speedup up to a certain number of computing nodes and good scalability on large data sets, and can improve the efficiency of the recommendation algorithm.
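The abstract names the similarity measure and the co-occurrence idea but does not give the formulas or the layout of the four jobs. As a rough single-machine illustration only, the sketch below counts item co-occurrences from user histories and scores each item pair with the entropy-based form of Dunning's log-likelihood ratio, mapped into [0, 1) as `1 - 1/(1 + LLR)`. That this is the exact form used in the paper is an assumption, and all function names and the sample data are hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations


def x_log_x(x):
    # Convention: 0 * log(0) is treated as 0.
    return 0.0 if x == 0 else x * math.log(x)


def entropy(*counts):
    # Unnormalized entropy: N*log(N) - sum(k*log(k)), with N = sum(counts).
    return x_log_x(sum(counts)) - sum(x_log_x(k) for k in counts)


def log_likelihood_ratio(k11, k12, k21, k22):
    # k11: users with both items, k12: only item A,
    # k21: only item B, k22: neither item.
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp at zero to absorb floating-point rounding.
    return max(0.0, 2.0 * (row + col - mat))


def llr_similarity(k11, k12, k21, k22):
    # Map the unbounded ratio into [0, 1).
    return 1.0 - 1.0 / (1.0 + log_likelihood_ratio(k11, k12, k21, k22))


def item_similarities(user_items):
    """user_items: dict mapping a user id to the set of items they interacted with."""
    item_count = defaultdict(int)   # number of users per item
    cooccur = defaultdict(int)      # sparse co-occurrence counts per item pair
    for items in user_items.values():
        for item in items:
            item_count[item] += 1
        for a, b in combinations(sorted(items), 2):
            cooccur[(a, b)] += 1

    n_users = len(user_items)
    sims = {}
    for (a, b), k11 in cooccur.items():
        k12 = item_count[a] - k11
        k21 = item_count[b] - k11
        k22 = n_users - k11 - k12 - k21
        sims[(a, b)] = llr_similarity(k11, k12, k21, k22)
    return sims


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the paper.
    users = {"u1": {"a", "b"}, "u2": {"a", "b"}, "u3": {"c"}, "u4": {"c"}}
    print(item_similarities(users))
```

In a MapReduce setting, the per-user loop roughly corresponds to map-side emission of item pairs and the count aggregation to a reduce step; how the paper splits this across its four jobs is not stated in the abstract.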

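Keywords: cloud computing; MapReduce programming model; collaborative filtering; log-likelihood similarity; co-occurrence matrix; parallelization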

Classification: TP312 [Automation and Computer Technology: Computer Software and Theory]
