Parallelizable Overlapping Community Detection Algorithm Based on Local Expansion

Source: Computer Science (《计算机科学》), 2016, No. 9, pp. 61-65 (5 pages)

Funding: Supported by the National Natural Science Foundation of China (61271374)

Abstract: An effective way to process massive datasets is to decompose an algorithm into a series of mutually independent tasks and then execute them in parallel using open-source tools. Among overlapping community detection algorithms, methods based on local expansion typically need only the information of a local community and its neighboring vertices during the expansion phase, so they can potentially be executed in parallel. This paper proposes a parallelizable local expansion algorithm for overlapping community detection and implements it with the open-source tool Spark. The algorithm consists of four phases. First, a group of unrelated central vertices is selected, and their corresponding local networks are used as seeds. Second, the selected seeds are filtered by removing local networks whose vertices are not tightly connected. Third, a batch expansion strategy is used to expand the seeds: in each step, a batch of neighboring vertices is added to the local community or a batch of vertices is removed from it. Finally, similar communities are merged. Experimental results on synthetic networks and real-world networks show that the proposed algorithm is both accurate and efficient.

Keywords: complex networks; overlapping community detection; local expansion; parallel algorithm; Spark

Classification code: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]
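
The four phases named in the abstract can be illustrated with a minimal single-machine sketch. The abstract does not give the seed selection rule, the filtering threshold, the expansion criterion, or the similarity measure, so everything concrete below is an assumption made for illustration: centers are taken greedily by degree, seeds are filtered by edge density, a vertex is added or removed according to the fraction of its neighbors inside the community, and communities are merged by Jaccard similarity. All function names and parameters (`select_seeds`, `min_density`, `ratio`, `min_jaccard`, and so on) are hypothetical, and the sketch does not use Spark.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map {vertex: set(neighbors)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def select_seeds(adj):
    """Phase 1 (assumed rule): visit vertices by decreasing degree and take a
    vertex as a center if it lies in no earlier seed; seed = center + neighbors."""
    covered, seeds = set(), []
    for v in sorted(adj, key=lambda x: (-len(adj[x]), x)):
        if v not in covered:
            seed = {v} | adj[v]
            seeds.append(seed)
            covered |= seed
    return seeds


def density(adj, nodes):
    """Fraction of possible internal edges that are present."""
    n = len(nodes)
    if n < 2:
        return 0.0
    internal = sum(len(adj[v] & nodes) for v in nodes) / 2
    return internal / (n * (n - 1) / 2)


def filter_seeds(adj, seeds, min_density=0.5):
    """Phase 2 (assumed criterion): drop seeds that are loosely connected."""
    return [s for s in seeds if density(adj, s) >= min_density]


def expand(adj, seed, ratio=0.5, max_rounds=20):
    """Phase 3 (assumed criterion): each round adds a whole batch of frontier
    vertices and removes a whole batch of members, judged by the fraction of a
    vertex's neighbors that lie inside the community."""
    community = set(seed)
    for _ in range(max_rounds):
        frontier = set().union(*(adj[v] for v in community)) - community
        to_add = {v for v in frontier
                  if len(adj[v] & community) / len(adj[v]) > ratio}
        to_drop = {v for v in community
                   if len(adj[v] & community) / len(adj[v]) < ratio}
        if not to_add and not to_drop:
            break
        community = (community | to_add) - to_drop
    return community


def merge_similar(communities, min_jaccard=0.5):
    """Phase 4 (assumed measure): merge communities with high Jaccard overlap."""
    merged = []
    for c in communities:
        c = set(c)
        if not c:
            continue
        changed = True
        while changed:
            changed = False
            for other in merged:
                if len(c & other) / len(c | other) >= min_jaccard:
                    merged.remove(other)
                    c |= other
                    changed = True
                    break
        merged.append(c)
    return merged


def detect_communities(edges):
    adj = build_adjacency(edges)
    seeds = filter_seeds(adj, select_seeds(adj))
    # Each expansion reads only its own seed and the adjacency map,
    # so the calls are independent of one another.
    return merge_similar([expand(adj, s) for s in seeds])


# Toy graph: two 4-cliques plus vertex 8 attached to both.
edges = (list(combinations(range(4), 2))
         + list(combinations(range(4, 8), 2))
         + [(8, 2), (8, 3), (8, 4), (8, 5)])
result = detect_communities(edges)
print(sorted(sorted(c) for c in result))
```

On this toy graph the sketch returns two communities, {0, 1, 2, 3, 8} and {4, 5, 6, 7, 8}, which overlap at vertex 8. Because each call to `expand` depends only on one seed and the shared adjacency data, the expansion phase is the natural candidate for a parallel map over seeds, which is the property the abstract points to when motivating a Spark implementation.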
