Large data parallel clustering algorithm based on discovery of maximal class in the community  Cited by: 6


Authors: Qian Xiaodong[1], Cao Yang[1]

Affiliation: [1] School of Automation and Electrical Engineering, Lanzhou Jiaotong University, Lanzhou, Gansu 730070, China

Source: Journal of Nanjing University of Science and Technology, 2016, No. 1, pp. 117-123 (7 pages)

Funding: National Natural Science Foundation of China (71461017)

Abstract: To find network structure in big data accurately and quickly, this paper proposes a big data clustering algorithm based on maximal classes in the community. To cut the time cost caused by uncertain initial nodes and by fitness function evaluation, local key nodes are introduced and the fitness formula is improved. For the formation of the initial community, the concept of the maximal clique is introduced; analysis of its properties shows that the core category of a community is composed of maximal cliques. A method is proposed for obtaining local core categories by discovering maximal cliques, together with a parallel strategy for the maximal clique discovery algorithm. A parallel strategy for the whole algorithm is then given and tested on real data sets. The experimental results show that the proposed algorithm is feasible and effective, and that it is suitable for discovering network structure in large-scale data.
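The abstract does not say which maximal clique discovery procedure the authors use. As an illustration only, the following minimal sketch assumes the classic Bron-Kerbosch algorithm with pivoting; the function name `maximal_cliques`, the adjacency-dictionary graph representation, and the toy graph are all hypothetical and are not taken from the paper.

```python
from typing import Dict, FrozenSet, Iterator, Set

# Undirected graph as an adjacency dictionary (an assumed representation).
Graph = Dict[int, Set[int]]


def maximal_cliques(graph: Graph) -> Iterator[FrozenSet[int]]:
    """Yield every maximal clique of an undirected graph.

    Assumed technique: Bron-Kerbosch with pivoting, not necessarily
    the procedure used in the paper.
    """

    def expand(r: Set[int], p: Set[int], x: Set[int]) -> Iterator[FrozenSet[int]]:
        # r: current clique, p: candidates that can extend it,
        # x: vertices already explored (used to reject non-maximal cliques).
        if not p and not x:
            yield frozenset(r)
            return
        # Pivot on the vertex with the most neighbours among the candidates;
        # only non-neighbours of the pivot need to be branched on.
        pivot = max(p | x, key=lambda v: len(graph[v] & p))
        for v in list(p - graph[pivot]):
            yield from expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(graph), set())


# Toy graph: two triangles {1, 2, 3} and {4, 5, 6} joined by the edge 3-4.
graph: Graph = {
    1: {2, 3},
    2: {1, 3},
    3: {1, 2, 4},
    4: {3, 5, 6},
    5: {4, 6},
    6: {4, 5},
}
cliques = set(maximal_cliques(graph))
```

On this toy graph the two triangles and the bridging edge are the maximal cliques. In the spirit of the abstract, such cliques could serve as candidate core categories from which communities are grown, and the top-level branches of the search are independent of one another, which is one plausible place to apply a parallel strategy.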

Keywords: big data; clustering; complex network; local key node; core category; maximal clique; fitness; parallel algorithm

CLC number: TP391 [Automation and Computer Technology: Computer Application Technology]

 
