SOOP: Efficient Distributed Graph Computation Supporting Second-Order Random Walks  


Authors: Songjie Niu, Dongyan Zhou

Affiliations: [1] State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; [2] University of Chinese Academy of Sciences, Beijing 100049, China; [3] Bytedance Technology, Beijing 100086, China

Source: Journal of Computer Science & Technology (计算机科学技术学报, English edition), 2021, Issue 5, pp. 985-1001 (17 pages)

Abstract: The second-order random walk has recently been shown to effectively improve the accuracy of graph analysis tasks. Existing work mainly focuses on centralized second-order random walk (SOW) algorithms. SOW algorithms rely on edge-to-edge transition probabilities to generate next random steps. However, it is prohibitively costly to store all the probabilities for large-scale graphs, and restricting the number of probabilities to consider can negatively impact the accuracy of graph analysis tasks. In this paper, we propose and study an alternative approach, SOOP (second-order random walks with on-demand probability computation), that avoids the space overhead by computing the edge-to-edge transition probabilities on demand during the random walk. However, the same probabilities may be computed multiple times when the same edge appears multiple times in SOW, incurring extra cost for redundant computation and communication. We propose two optimization techniques that, respectively, reduce the complexity of computing edge-to-edge transition probabilities to generate next random steps and reduce the cost of communicating out-neighbors for the probability computation. Our experiments on real-world and synthetic graphs show that SOOP achieves orders of magnitude better performance than baseline precompute solutions, and that it can efficiently compute SOW algorithms on billion-scale graphs.

Keywords: second-order random walk (SOW); Node2Vec; second-order PageRank; distributed graph computation; SOOP (second-order random walks with on-demand probability computation)

Classification: TP31 [Automation and Computer Technology: Computer Software and Theory]
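
To make the abstract's idea of computing edge-to-edge transition probabilities on demand concrete, the following is a minimal single-machine sketch, not the paper's distributed implementation and without its two optimizations. It assumes an unweighted graph stored as a dictionary of neighbor sets and uses the Node2Vec-style return parameter `p` and in-out parameter `q`; the function names `second_order_weights` and `second_order_walk` are hypothetical and chosen only for illustration.

```python
import random


def second_order_weights(graph, prev, cur, p=1.0, q=1.0):
    """Compute, on demand, the unnormalized weights for leaving `cur`
    given that the walk arrived over the edge (prev, cur).

    Assumption: Node2Vec-style biases on an unweighted graph.
    Returns (candidates, weights) as two aligned lists.
    """
    prev_neighbors = graph[prev]
    candidates = sorted(graph[cur])
    weights = []
    for nxt in candidates:
        if nxt == prev:
            weights.append(1.0 / p)   # step back to the previous node
        elif nxt in prev_neighbors:
            weights.append(1.0)       # stay within distance 1 of prev
        else:
            weights.append(1.0 / q)   # move further away from prev
    return candidates, weights


def second_order_walk(graph, start, length, p=1.0, q=1.0, rng=None):
    """Generate one walk of at most `length` nodes. Nothing is precomputed
    or stored: weights are recomputed at every step."""
    rng = rng or random.Random()
    walk = [start]
    if length < 2 or not graph[start]:
        return walk
    # The first step has no previous edge, so it is a uniform choice.
    walk.append(rng.choice(sorted(graph[start])))
    while len(walk) < length:
        prev, cur = walk[-2], walk[-1]
        candidates, weights = second_order_weights(graph, prev, cur, p, q)
        if not candidates:
            break  # dead end
        walk.append(rng.choices(candidates, weights=weights, k=1)[0])
    return walk


# Small undirected example graph (illustrative data only).
example_graph = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1}, 3: {1}}
example_walk = second_order_walk(example_graph, start=0, length=5,
                                 p=2.0, q=4.0, rng=random.Random(7))
```

In this sketch the weights for the edge (prev, cur) are recomputed every time the walk traverses that edge, which is the redundancy the abstract describes. Computing them also requires the neighbor set of `prev`, which in a distributed setting may reside on another machine; that is the communication cost the abstract refers to.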

 
