Application of Improved DBSCAN Algorithm on Spark Platform (cited by: 7)

Author: DENG Ding-sheng (School of Science and Technology, Sichuan Minzu College, Kangding, Sichuan 626001, China)

Source: Computer Science (计算机科学), 2020, Issue S02, pp. 425-429, 443 (6 pages)

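Funding: Key Natural Science Project of Sichuan Minzu College (XYZB19001ZA); Key Natural Science Project of the Sichuan Provincial Department of Education (17ZA0295); Sichuan Minzu College 2017 Applied Demonstration Course Project (sfkc201705); National Natural Science Foundation of China (11461058).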

Abstract: To address the high memory usage of the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) clustering algorithm, this paper combines an improved DBSCAN algorithm with parallel clustering on the Spark platform and clusters massive data sets with a divide-and-conquer strategy, which greatly reduces the algorithm's memory usage. Simulation experiments show that the proposed parallel method effectively alleviates memory shortage. The method can also be used to evaluate the clustering results of DBSCAN on the Hadoop platform and to compare the two approaches, and it achieves good computing performance: its speedup is about 24% higher than that obtained on the Hadoop platform. The proposed method can therefore be used to assess the strengths and weaknesses of DBSCAN in clustering tasks.

Keywords: parallel computing; DBSCAN; clustering algorithm; Spark; clustering speedup

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
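
The divide-and-conquer idea mentioned in the abstract can be illustrated with a small sketch. This record does not include the paper's full text, so the code below is not the author's algorithm: the strip partitioning along the x-axis, the 2*eps overlap between strips, the union-find merge, and the names `dbscan` and `partitioned_dbscan` are all assumptions made for illustration. It is written in plain Python so that it runs without a cluster; on Spark, each per-strip run would roughly correspond to one task (for example inside `RDD.mapPartitions`), so that no single worker has to hold the whole data set.

```python
from collections import deque
from math import dist


def dbscan(points, eps, min_pts):
    """Plain in-memory DBSCAN. Returns (labels, core); label -1 marks noise."""
    n = len(points)
    # Brute-force eps-neighbourhoods (each point counts itself).
    nbrs = [[j for j in range(n) if dist(points[i], points[j]) <= eps]
            for i in range(n)]
    core = [len(nb) >= min_pts for nb in nbrs]
    labels = [-1] * n
    cluster = 0
    for i in range(n):
        if not core[i] or labels[i] != -1:
            continue
        labels[i] = cluster
        queue = deque([i])
        while queue:  # only core points are ever queued
            for q in nbrs[queue.popleft()]:
                if labels[q] == -1:
                    labels[q] = cluster
                    if core[q]:
                        queue.append(q)
        cluster += 1
    return labels, core


def partitioned_dbscan(points, eps, min_pts, n_parts):
    """Hypothetical divide-and-conquer DBSCAN over x-axis strips.

    Each strip is clustered together with a 2*eps halo (an assumption of
    this sketch), so every point within eps of a home point gets its
    exact core flag inside that strip.
    """
    xs = [p[0] for p in points]
    lo = min(xs)
    width = (max(xs) - lo) / n_parts or 1.0
    home = [min(int((x - lo) / width), n_parts - 1) for x in xs]

    parent = {}  # union-find over (strip, local label) keys

    def find(k):
        parent.setdefault(k, k)
        while parent[k] != k:
            parent[k] = parent[parent[k]]  # path halving
            k = parent[k]
        return k

    home_key = [None] * len(points)    # cluster key assigned in the home strip
    home_core = [False] * len(points)  # core flag computed in the home strip
    seen = [[] for _ in points]        # every local cluster a point fell into

    for part in range(n_parts):
        left, right = lo + part * width, lo + (part + 1) * width
        idx = [i for i, x in enumerate(xs)
               if left - 2 * eps <= x <= right + 2 * eps]
        labels, core = dbscan([points[i] for i in idx], eps, min_pts)
        for local, i in enumerate(idx):
            if labels[local] == -1:
                continue
            key = (part, labels[local])
            seen[i].append(key)
            if home[i] == part:
                home_key[i], home_core[i] = key, core[local]

    # A core point belongs to exactly one cluster, so all local clusters
    # that contain the same core point are merged into one.
    for i, keys in enumerate(seen):
        if home_core[i]:
            for key in keys:
                parent[find(key)] = find(home_key[i])

    ids = {}  # renumber merged clusters as 0, 1, 2, ...
    return [-1 if k is None else ids.setdefault(find(k), len(ids))
            for k in home_key]


if __name__ == "__main__":
    # A chain that crosses strip borders, one isolated point, a second chain.
    demo = ([(float(x), 0.0) for x in range(15)] + [(17.0, 10.0)]
            + [(float(x), 0.0) for x in range(20, 25)])
    print(partitioned_dbscan(demo, eps=1.5, min_pts=3, n_parts=3))
```

The memory saving in this sketch comes only from the fact that each local run sees one strip plus its halo instead of the full data set. Border points that are within eps of two different clusters may be assigned differently than in a single-machine run, as is usual for DBSCAN.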
