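Affiliations: [1] Department of Mathematics and Computer Science, Shangrao Normal University, Shangrao, Jiangxi 334000, China [2] School of Information and Engineering, Lanzhou University, Lanzhou 730000, China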
Source: Journal of Computer Applications (《计算机应用》), 2007, No. 12, pp. 3042-3044 (3 pages)
Funding: Natural Science Foundation of Gansu Province (3ZS051-A25-035); Shangrao Normal University Foundation
Abstract: The running time of the K-means clustering algorithm depends heavily on the choice of initial points, yet in practice neither the right value of k nor an effective way to select the initial points is known. Building on a detailed study of initialization in K-means, this paper proposes an efficient algorithm for selecting the initial points that ensures at least one point is selected in each cluster. Because existing inter-cluster similarity measures do not capture the similarity of two clusters well, a new measure, the influence factor between clusters, is introduced, and clusters are merged according to it. Together, the merging method and the initial point selection algorithm quickly and automatically choose the initial centers and determine their number from the characteristics of the dataset itself. Tests on Gaussian datasets gave satisfactory results.
Classification: TP301 [Automation and Computer Technology: Computer System Architecture]
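
The abstract's starting point, that K-means is sensitive to its initial points, can be illustrated with a minimal sketch. This is plain Lloyd-style K-means in pure Python, not the authors' initial point selection or merging algorithm, and it does not implement the influence factor, whose definition this record does not give. The function names `kmeans`, `sse`, and `gaussian_blobs` and the sample data are assumptions made for illustration only.

```python
import random


def kmeans(points, centers, max_iter=100):
    """Run Lloyd-style K-means from the given initial centers.

    Returns (final_centers, iterations_used).
    """
    centers = [tuple(c) for c in centers]
    for iteration in range(1, max_iter + 1):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            clusters[dists.index(min(dists))].append(p)
        # Update step: move each center to the mean of its members.
        new_centers = []
        for c, members in zip(centers, clusters):
            if members:
                mean = tuple(sum(col) / len(members) for col in zip(*members))
                new_centers.append(mean)
            else:
                new_centers.append(c)  # keep an empty cluster's center in place
        if new_centers == centers:
            return centers, iteration
        centers = new_centers
    return centers, max_iter


def sse(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(
        min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
        for p in points
    )


def gaussian_blobs(means, n_per_blob, sigma, seed):
    """Hypothetical test data: 2-D Gaussian blobs around the given means."""
    rng = random.Random(seed)
    return [
        (rng.gauss(mx, sigma), rng.gauss(my, sigma))
        for mx, my in means
        for _ in range(n_per_blob)
    ]


if __name__ == "__main__":
    data = gaussian_blobs([(0, 0), (10, 0), (5, 8)], 50, 1.0, seed=1)
    # One initial point per blob versus three initial points in one blob.
    for init in ([(0, 0), (10, 0), (5, 8)], [(0, 0), (0.5, 0), (0, 0.5)]):
        final, iters = kmeans(data, init)
        print(init, "->", iters, "iterations, SSE", round(sse(data, final), 2))
```

On the four points `(0, 0)`, `(0, 1)`, `(10, 0)`, `(10, 1)`, starting from `(0, 0)` and `(10, 0)` separates the left and right pairs, while starting from `(0, 0)` and `(0, 1)` converges to a much worse split into bottom and top rows. A selection rule that places at least one initial point in each cluster, as the abstract describes, is meant to avoid the second kind of start.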