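Authors: Ju Yufang (巨瑜芳)[1], Lei Xiaofeng (雷小锋)[1], Dai Bin (戴斌)[1], Zhuang Wei (庄伟)[1], Song Fengtai (宋丰泰)[1]
Affiliation: [1] School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, Jiangsu 221116
Source: Application Research of Computers (《计算机应用研究》), 2012, No. 8, pp. 2837-2840 (4 pages)
Funding: Jiangsu Province Basic Research Program (BK2009093); China University of Mining and Technology Science and Technology Fund (OD080313)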
Abstract: Clustering seeks the best partition of unlabeled observations under the assumption that the data have some group structure. To address shortcomings of existing clustering algorithms, this paper assumes that the resulting clusters are dense and clearly separated from one another, and on that basis proposes FGClus, a cluster analysis method based on the discrete Fourier transform and connected graphs. First, the method computes the k-distance matrix for each sample point and serializes it as the input signal of a discrete Fourier transform. It then extracts the complex term with the smallest amplitude in the frequency domain and constructs from it an input sequence for the inverse Fourier transform, which yields the optimal threshold in the time domain. Finally, it uses this threshold together with a connected graph to guide the final clustering. Experiments show that FGClus overcomes several shortcomings of the K-means algorithm (the number of clusters must be fixed before clustering, the result is sensitive to the choice of initial representative points, and only spherical clusters can be found) and achieves good clustering results.
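Keywords: cluster analysis; discrete Fourier transform; connected graph; shortest-path k-nearest-neighbor query; optimal threshold
Classification: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]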
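
The final step described in the abstract, clustering with a distance threshold and a connected graph, can be illustrated with a minimal sketch. This is not the authors' FGClus implementation: the record does not give the details of the Fourier-based threshold selection, so the threshold here is simply passed in as a parameter, and the function name `threshold_clusters`, the use of Euclidean distance, and the sample points are all assumptions made for illustration. The sketch links any two points whose distance is at most the threshold and treats each connected component of that graph as one cluster.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def threshold_clusters(points, threshold):
    """Label each point with the index of its connected component.

    Two points are joined by an edge when their Euclidean distance is at
    most `threshold` (assumed here; the paper derives its threshold from
    a Fourier-based procedure that this sketch does not reproduce).
    """
    n = len(points)
    labels = [-1] * n          # -1 means "not yet assigned"
    cluster = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        # Depth-first traversal of the component containing `start`.
        labels[start] = cluster
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and dist(points[i], points[j]) <= threshold:
                    labels[j] = cluster
                    stack.append(j)
        cluster += 1
    return labels


# Hypothetical sample data: a chain of three points and a separate pair.
points = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (5.0, 5.0), (5.5, 5.0)]
labels = threshold_clusters(points, threshold=0.6)
print(labels)  # [0, 0, 0, 1, 1]
```

In this sketch the number of clusters is not specified in advance, and the chain of three points ends up in one cluster even though its endpoints are farther apart than the threshold. This is consistent with the abstract's claim that a threshold plus connected-graph approach is not limited to spherical clusters.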