Improved CLIQUE Algorithm and Its Parallelization (cited by: 3)

Authors: LIN Peng, CHEN Xi [1,2], LONG Peng-fei [1,2], FU Ming [1,2]

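Affiliations: [1] Hunan Provincial Key Laboratory of Intelligent Processing of Big Data on Transportation, Changsha University of Science and Technology, Changsha, Hunan 410114, China; [2] School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha, Hunan 410114, China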

Source: Computing Technology and Automation, 2018, No. 4, pp. 49-54 (6 pages)

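Funding: National Natural Science Foundation of China (61772087); Changsha University of Science and Technology Graduate Research Innovation Project (CX2017SS20)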

Abstract: CLIQUE is an efficient clustering algorithm, but its clustering results suffer from serrated (jagged) cluster boundaries, and its efficiency degrades sharply as data size and dimensionality grow. To address these problems, this paper proposes an improved CLIQUE algorithm. The algorithm first applies a boundary-correcting method and a grid-sliding method: it scans the borders of dense regions and the sparse regions to retrieve dense grids that were pruned, which improves the quality of the grid partition. The improved algorithm is then parallelized in a distributed fashion on top of MapReduce. A series of experiments tests its clustering accuracy, processing time, speedup, and scalability. The results show that the improved parallel algorithm raises clustering accuracy by 17% to 26%, effectively reduces the running time on massive data, and scales well.
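The abstract does not give implementation details, so the following is only a minimal illustrative sketch, not the authors' method. It assumes a uniform grid and shows how a map step (point to cell) and a reduce step (count per cell) can find dense cells, and how shifting the grid by an offset can recover a dense region that the original grid split across cell borders. The names `map_point`, `reduce_counts`, `dense_cells`, `cell_width`, `min_points`, and `offset` are hypothetical.

```python
from collections import Counter
from itertools import chain
from math import floor


def map_point(point, cell_width, offset=0.0):
    """Map step: emit (cell_id, 1) for the grid cell that contains the point."""
    cell = tuple(floor((x - offset) / cell_width) for x in point)
    return [(cell, 1)]


def reduce_counts(pairs):
    """Reduce step: sum the counts emitted for each cell."""
    counts = Counter()
    for cell, n in pairs:
        counts[cell] += n
    return counts


def dense_cells(points, cell_width, min_points, offset=0.0):
    """Return cells holding at least min_points points (assumed density rule)."""
    pairs = chain.from_iterable(
        map_point(p, cell_width, offset) for p in points
    )
    counts = reduce_counts(pairs)
    return {cell: n for cell, n in counts.items() if n >= min_points}


# Four points clustered around the grid corner (1.0, 1.0).
points = [(0.9, 0.9), (1.1, 1.1), (0.95, 1.05), (1.05, 0.95)]

# Original grid: each point falls in a different cell, so none is dense.
print(dense_cells(points, cell_width=1.0, min_points=3))

# Grid slid by half a cell: all four points share one cell, which is dense.
print(dense_cells(points, cell_width=1.0, min_points=3, offset=0.5))
```

In a real MapReduce job the map and reduce functions would run on separate workers; here they are chained in one process only to keep the sketch self-contained.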

Keywords: boundary-correcting method; grid-sliding method; CLIQUE algorithm; MapReduce

Classification: TP311.1 [Automation and Computer Technology: Computer Software and Theory]

 
