Spectral clustering ensemble algorithm based on three-order tensor for large-scale data (基于三阶张量的大规模数据谱聚类集成算法)  Cited by: 1

Authors: WU Yunzheng (仵匀政); DU Tao (杜韬) [1,2]; ZHOU Jin (周劲); CHEN Di (陈迪) [1]; WANG Xingeng (王心耕)

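Affiliations: [1] College of Information Science and Engineering, University of Jinan, Jinan 250024, Shandong, China; [2] Shandong Provincial Key Laboratory of Network Based Intelligent Computing, Jinan 250024, Shandong, China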

Source: Big Data Research (《大数据》), 2024, No. 3, pp. 133-148 (16 pages)

Funding: National Natural Science Foundation of China (No. 62273164, No. 61873324); Natural Science Foundation of Shandong Province (No. ZR2019MF040).

Abstract: To reduce the computational burden of spectral clustering on large-scale data and to further improve clustering accuracy and robustness, a spectral clustering ensemble algorithm based on a three-order tensor is proposed for large-scale data. First, a mixed representative nearest neighbor approximation method is proposed to construct sparse affinity sub-matrices between data objects. Each sparse affinity sub-matrix is then represented as a bipartite graph, and a preliminary clustering result is obtained by graph partitioning. Finally, a three-order tensor ensemble method is proposed to fuse the multiple clustering results into a unified final clustering result. In experiments on large-scale real and synthetic datasets, the proposed algorithm performs better than the classical spectral clustering algorithm, clustering ensemble algorithms, and recent improved variants of them.
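The three stages named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify how representatives are chosen or how the tensor is fused, so the sketch assumes uniformly sampled representatives, a Gaussian kernel, and plain averaging of co-association slices. All function and parameter names (`base_clustering`, `tensor_ensemble`, `n_reps`, `n_nearest`) are hypothetical, and the dense n x n slices are only practical for small inputs.

```python
import numpy as np

def kmeans(X, k, rng, iters=50):
    """Lloyd's k-means with farthest-first initialisation."""
    centers = [X[rng.integers(len(X))]]
    for _ in range(k - 1):
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def embed_and_cluster(vectors, k, rng):
    """Row-normalise a spectral embedding and cluster it with k-means."""
    norms = np.maximum(np.linalg.norm(vectors, axis=1, keepdims=True), 1e-12)
    return kmeans(vectors / norms, k, rng)

def base_clustering(X, k, n_reps, n_nearest, rng):
    """One base clustering from a sparse object-to-representative affinity."""
    # Assumption: representatives are a uniform random sample of the objects.
    reps = X[rng.choice(len(X), size=n_reps, replace=False)]
    d2 = ((X[:, None, :] - reps[None, :, :]) ** 2).sum(axis=-1)
    # Keep only the n_nearest closest representatives of each object.
    nearest = np.argsort(d2, axis=1)[:, :n_nearest]
    rows = np.arange(len(X))[:, None]
    kept = d2[rows, nearest]
    B = np.zeros_like(d2)  # sparse affinity sub-matrix, stored densely here
    B[rows, nearest] = np.exp(-kept / (2 * kept.mean() + 1e-12))
    # Bipartite graph partitioning via the SVD of D_x^{-1/2} B D_r^{-1/2}.
    dx = B.sum(axis=1)
    dr = np.maximum(B.sum(axis=0), 1e-12)
    Bn = B / np.sqrt(dx)[:, None] / np.sqrt(dr)[None, :]
    U, _, _ = np.linalg.svd(Bn, full_matrices=False)
    return embed_and_cluster(U[:, :k], k, rng)

def tensor_ensemble(X, k, n_base=5, n_reps=20, n_nearest=3, seed=0):
    """Fuse several base clusterings through a stacked co-association tensor."""
    rng = np.random.default_rng(seed)
    base = [base_clustering(X, k, n_reps, n_nearest, rng) for _ in range(n_base)]
    # Three-order tensor: one n x n co-association slice per base clustering.
    T = np.stack([(b[:, None] == b[None, :]).astype(float) for b in base])
    A = T.mean(axis=0)  # Assumption: slices are fused by plain averaging.
    d = A.sum(axis=1)
    _, vecs = np.linalg.eigh(A / np.sqrt(np.outer(d, d)))
    return embed_and_cluster(vecs[:, -k:], k, rng)

# Toy usage: two well-separated Gaussian blobs.
demo_rng = np.random.default_rng(1)
X_demo = np.vstack([demo_rng.normal(0.0, 0.5, (100, 2)),
                    demo_rng.normal(10.0, 0.5, (100, 2))])
demo_labels = tensor_ensemble(X_demo, k=2)
```

The saving in the first two stages comes from `B` having only `n_nearest` non-zero entries per row, so a real large-scale implementation would presumably store it in a sparse format and use a truncated SVD rather than the dense calls used here.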

Keywords: data clustering; large-scale data; spectral clustering; three-order tensor; clustering ensemble

Classification code: TP301 [Automation and Computer Technology: Computer System Architecture]

 
