Parallel construction of similar matrix based on MapReduce  Cited by: 1


Authors: LUO Li-xia; JIANG Sheng-yi (School of Electronic Information, Hunan Institute of Information Technology, Changsha 410151, China; School of Information Science and Technology, Guangdong University of Foreign Studies, Guangzhou 510006, China)

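Affiliations: [1] School of Electronic Information, Hunan Institute of Information Technology, Changsha, Hunan 410151, China [2] School of Information Science and Technology, Guangdong University of Foreign Studies, Guangzhou, Guangdong 510006, China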

Source: Computer Engineering and Design, 2021, No. 5, pp. 1368-1375 (8 pages)

Funding: National Natural Science Foundation of China (61572145).

Abstract: As user counts and data volumes grow rapidly, traditional collaborative filtering algorithms that rely on constructing a similarity matrix become inefficient. To address this, a parallel similarity matrix construction algorithm based on the MapReduce framework is proposed. An improved locality-sensitive hashing (LSH) algorithm first divides the item set into disjoint groups; intra-group and inter-group similarities are then computed on the MapReduce framework. Comparative experiments use the MovieLens dataset. The results show that, under the same experimental conditions, the proposed method reduces average execution time by more than 26.4% relative to the traditional serial construction method and by more than 14.4% relative to the two-round MapReduce construction method. The proposed method is more economical and scalable on large-scale datasets, and the improved LSH algorithm effectively improves the computational efficiency of subsequent rounds.
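The pipeline described in the abstract (group items with LSH, then compute intra-group and inter-group similarities) can be illustrated with a small single-process sketch. Several things here are assumptions, not details from the paper: plain MinHash stands in for the improved LSH algorithm, Jaccard similarity over each item's set of rating users stands in for the unspecified similarity measure, MapReduce map tasks are simulated with Python generators, and all function names are hypothetical.

```python
import random
from collections import defaultdict
from itertools import combinations

PRIME = 2_147_483_647  # modulus for the illustrative hash family

def minhash_signature(users, hash_params):
    """MinHash signature of a non-empty set of integer user ids."""
    return tuple(min((a * u + b) % PRIME for u in users) for a, b in hash_params)

def lsh_groups(item_users, num_hashes=2, seed=0):
    """Partition items into disjoint groups keyed by their full signature.

    Assumption: standard MinHash, not the paper's improved LSH.
    """
    rng = random.Random(seed)
    params = [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
              for _ in range(num_hashes)]
    groups = defaultdict(list)
    for item, users in item_users.items():
        groups[minhash_signature(users, params)].append(item)
    return list(groups.values())

def jaccard(a, b):
    """Jaccard similarity of two non-empty sets (assumed measure)."""
    return len(a & b) / len(a | b)

def map_intra_group(group, item_users):
    """Simulated map task: emit ((i, j), sim) for each pair inside one group."""
    for i, j in combinations(sorted(group), 2):
        yield (i, j), jaccard(item_users[i], item_users[j])

def map_inter_group(g1, g2, item_users):
    """Simulated map task: emit ((i, j), sim) for each pair across two groups."""
    for i in g1:
        for j in g2:
            yield (min(i, j), max(i, j)), jaccard(item_users[i], item_users[j])

def build_similarity(item_users):
    """Collect intra-group and inter-group results into one sparse matrix."""
    groups = lsh_groups(item_users)
    sim = {}
    for g in groups:                         # intra-group round
        sim.update(map_intra_group(g, item_users))
    for g1, g2 in combinations(groups, 2):   # inter-group round
        sim.update(map_inter_group(g1, g2, item_users))
    return sim
```

Calling `build_similarity({"A": {1, 2, 3}, "B": {1, 2, 3}, "C": {2, 3, 4}})` returns a dictionary keyed by ordered item pairs. Because the groups are disjoint, each pair is produced by exactly one simulated map task, so in a real MapReduce job the tasks could run independently without deduplication.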

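Keywords: similarity matrix construction; similarity computation; MapReduce framework; collaborative filtering recommendation algorithm; parallel computing; locality-sensitive hashing algorithm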

Classification number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
