An Efficient Clustering Algorithm of Large Scale and High Dimensional Data Set (Cited by: 2)

Source: Journal of Applied Sciences (《应用科学学报》), 2006, No. 4, pp. 396-400 (5 pages)

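Funding: Supported by the National Natural Science Foundation of China (70371015) and the Specialized Research Fund for the Doctoral Program of Higher Education, Ministry of Education (20040286009)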

Abstract: Clustering large-scale, high-dimensional data sets has become a focus of current clustering research. Because of the high dimensionality, clusters are often hidden in subspaces of the original data space, and traditional clustering algorithms fail to produce meaningful clusters. In addition, high-dimensional data often contain a large amount of random noise, which introduces additional efficiency problems. To overcome these problems, this paper proposes OpCluster, a clustering algorithm that builds on CLIQUE and is based on optimal interval partitioning and data set partitioning. The algorithm is validated on synthetic data, and the experimental results show that OpCluster clusters large-scale, high-dimensional data sets effectively.

Keywords: clustering algorithm; subspace clustering; optimal partitioning; data partitioning

Classification: TP311 [Automation and Computer Technology: Computer Software and Theory]
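
Note: the abstract names CLIQUE as the starting point but does not describe OpCluster's optimal interval partitioning or data set partitioning, so those steps are not shown here. As background only, the following is a minimal sketch of the generic CLIQUE idea of finding dense grid units bottom-up, assuming equal-width intervals on data scaled to [0, 1]. The function names and the parameters `xi` (intervals per dimension) and `tau` (density threshold) are illustrative assumptions, not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def cell_of(value, xi, lo=0.0, hi=1.0):
    """Index of the equal-width interval (0..xi-1) that contains value."""
    width = (hi - lo) / xi
    return min(int((value - lo) / width), xi - 1)


def dense_units_1d(points, xi, tau):
    """Return {(dim, interval)} for 1-D units holding more than tau of the points."""
    counts = Counter()
    for p in points:
        for dim, value in enumerate(p):
            counts[(dim, cell_of(value, xi))] += 1
    n = len(points)
    return {unit for unit, c in counts.items() if c / n > tau}


def dense_units_2d(points, dense_1d, xi, tau):
    """Return dense 2-D units, built only from pairs of dense 1-D units.

    A 2-D unit can be dense only if both of its 1-D projections are dense,
    so candidates are restricted to such pairs (Apriori-style pruning).
    """
    candidates = {
        (a, b)
        for a, b in combinations(sorted(dense_1d), 2)
        if a[0] != b[0]  # the two intervals must lie in different dimensions
    }
    counts = Counter()
    for p in points:
        cells = [(dim, cell_of(v, xi)) for dim, v in enumerate(p)]
        for pair in combinations(cells, 2):
            if pair in candidates:
                counts[pair] += 1
    n = len(points)
    return {unit for unit, c in counts.items() if c / n > tau}


# Toy data: three points agree in dimensions 0 and 1; dimension 2 is scattered.
points = [
    (0.10, 0.10, 0.90),
    (0.12, 0.15, 0.35),
    (0.15, 0.12, 0.65),
    (0.90, 0.85, 0.10),
]
d1 = dense_units_1d(points, xi=5, tau=0.5)
d2 = dense_units_2d(points, d1, xi=5, tau=0.5)
print(d1)
print(d2)
```

In this toy data the dense region appears only in the subspace spanned by dimensions 0 and 1, which illustrates why clusters in high-dimensional data may be visible in a subspace but not in the full space.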
