Clustering Ensemble Method Based on Similarity Between Clusters


Authors: ZHANG Dong-chao; CAI Jiang-hui [1]; YANG Hai-feng [1]; ZHENG Ai-yu (School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, China)

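Affiliation: [1] School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, Shanxi, China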

Source: Computer Technology and Development, 2023, No. 11, pp. 156-161 (6 pages)

Funding: National Natural Science Foundation of China (U1931209).

Abstract: Clustering ensemble is an important branch of clustering that fuses multiple base clusterings to produce a robust, high-quality final partition. Many current studies transform the original information into a co-association matrix and derive the final partition from it. However, most of them overlook two problems: clustering results are easily affected by noise, and the co-association matrix has high time and space complexity when the dataset is large. To address these problems, the paper proposes a clustering ensemble method based on similarity between clusters (CSCE). The method first uses the evidence accumulation model to find the similarity between the original objects and divides them into many small clusters. It then applies a new similarity measure to compute the similarity between clusters, forming a cluster-to-cluster similarity matrix. Finally, it partitions the cluster similarity matrix with normalized cut (NCUT) to obtain the final clustering result. The method merges low-quality, anomalous objects into the clusters most similar to them. Experiments on 8 datasets show that the method not only clusters well but also addresses the high time and space complexity of the traditional co-association matrix.
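The first two stages described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract does not give the CSCE similarity formula, so the sketch assumes a simple stand-in. Objects that share a cluster in every base clustering form a small cluster, and the similarity between two small clusters is the fraction of base clusterings that place them together (the evidence accumulation idea applied to clusters instead of objects). The function names `micro_clusters` and `cluster_similarity` are hypothetical, and the final NCUT partitioning step is omitted.

```python
def micro_clusters(base_clusterings):
    """Group objects that fall in the same cluster in every base clustering.

    base_clusterings: list of label lists, one list per base clustering,
    all of the same length (one label per object).
    Returns a dict mapping a label signature to the object indices sharing it.
    """
    groups = {}
    # zip(*...) yields, for each object, its tuple of labels across clusterings.
    for obj, signature in enumerate(zip(*base_clusterings)):
        groups.setdefault(signature, []).append(obj)
    return groups


def cluster_similarity(groups):
    """Assumed similarity: share of base clusterings in which two small
    clusters carry the same label. This is a stand-in, not the paper's measure.
    """
    signatures = list(groups)
    k = len(signatures)
    m = len(signatures[0])  # number of base clusterings
    sim = [[0.0] * k for _ in range(k)]
    for a in range(k):
        for b in range(k):
            agree = sum(x == y for x, y in zip(signatures[a], signatures[b]))
            sim[a][b] = agree / m
    return signatures, sim


# Three toy base clusterings of six objects (made-up labels).
base = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 2, 2],
    [1, 1, 1, 0, 0, 0],
]
groups = micro_clusters(base)
signatures, sim = cluster_similarity(groups)
for sig, row in zip(signatures, sim):
    print(groups[sig], [round(v, 2) for v in row])
```

The resulting matrix has one row per small cluster instead of one row per object, which is where the saving over a full object-level co-association matrix comes from. In a full pipeline this matrix would be handed to a normalized-cut or other spectral partitioning routine.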

Keywords: clustering ensemble; co-association matrix; base clustering; evidence accumulation; complexity

Classification number: TP301.6 [Automation and Computer Technology: Computer Systems Architecture]

 
