Improvement of Multivariate Data-based Spectral Clustering Algorithm and Determination of the Number of Clusters (基于多元数据的谱聚类算法改进与聚类个数确定)

Cited by: 7

Authors: Wang Bingcan (王丙参) [1]; Wei Yanhua (魏艳华) [1,2]; Zhang Beibei (张贝贝)

Affiliations: [1] School of Mathematics and Statistics, Tianshui Normal University, Tianshui, Gansu 741001, China; [2] School of Statistics, Capital University of Economics and Business, Beijing 100070, China

Source: Statistics & Decision (《统计与决策》), 2022, No. 12, pp. 5-11 (7 pages)

Funding: National Natural Science Foundation of China (Grants 11665019, 11671268).

Abstract: Building on the spectral clustering algorithm, this paper first uses the eigenvalues of the Laplacian matrix to construct a change-point plot for the number of clusters, which gives an intuitive way to determine that number. It then adds a penalty term on the number of clusters to the optimization objective in order to examine the choice of the number of clusters quantitatively. Finally, for multivariate data, pairwise constraint information is handled by revising the distance matrix, three adaptive similarity matrices are constructed from the distance matrix, and spectral clustering is then performed. Numerical simulations show that, for determining the number of clusters, the change-point plot is intuitive and effective, whereas the penalty method depends on the weight parameter of the penalty term and is therefore somewhat subjective. All three adaptive spectral clustering algorithms are effective, handle pairwise constraint information conveniently, and apply to a wide range of settings; the stable adaptive spectral clustering is more robust to the choice of the number of nearest neighbors.
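The record gives no formulas, so the following is only a minimal sketch of the general idea of reading the number of clusters from Laplacian eigenvalues, not the authors' method. It assumes a local-scaling similarity (each point's scale is the distance to its k-th nearest neighbor) as a stand-in for the paper's adaptive similarity matrices, the standard largest-eigengap heuristic as a stand-in for the change-point plot, and a simple distance override for pairwise constraints. All function names and the parameters `k`, `max_k`, and `big_factor` are hypothetical.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance matrix of the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def apply_pairwise_constraints(D, must_link=(), cannot_link=(), big_factor=10.0):
    """Assumed constraint handling: must-link pairs get distance 0,
    cannot-link pairs get a large distance (big_factor * max distance)."""
    D = D.copy()
    big = big_factor * D.max()
    for i, j in must_link:
        D[i, j] = D[j, i] = 0.0
    for i, j in cannot_link:
        D[i, j] = D[j, i] = big
    return D

def local_scaling_similarity(D, k=7):
    """Assumed adaptive similarity: W_ij = exp(-d_ij^2 / (sigma_i * sigma_j)),
    where sigma_i is the distance from point i to its k-th nearest neighbor."""
    # Column 0 of each sorted row is the point itself (distance 0).
    sigma = np.maximum(np.sort(D, axis=1)[:, k], 1e-12)
    W = np.exp(-(D ** 2) / (sigma[:, None] * sigma[None, :]))
    np.fill_diagonal(W, 0.0)
    return W

def laplacian_eigenvalues(W):
    """Ascending eigenvalues of the symmetric normalized Laplacian
    L = I - D^{-1/2} W D^{-1/2}."""
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = np.eye(W.shape[0]) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    return np.linalg.eigvalsh(L)

def estimate_num_clusters(eigvals, max_k=10):
    """Eigengap heuristic: the number of clusters is the index of the
    largest jump among the first max_k + 1 eigenvalues."""
    gaps = np.diff(eigvals[: max_k + 1])
    return int(np.argmax(gaps)) + 1

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    centers = np.array([[0.0, 0.0], [20.0, 0.0], [0.0, 20.0]])
    X = np.vstack([c + 0.5 * rng.standard_normal((30, 2)) for c in centers])
    D = apply_pairwise_constraints(pairwise_distances(X), must_link=[(0, 1)])
    eigvals = laplacian_eigenvalues(local_scaling_similarity(D, k=7))
    print("smallest eigenvalues:", np.round(eigvals[:6], 4))
    print("estimated number of clusters:", estimate_num_clusters(eigvals))
```

Plotting `eigvals` against its index would give a picture in the spirit of a change-point plot: the eigenvalues stay near zero for as many indices as there are well-separated clusters and then jump. Once a number of clusters is chosen, the usual next step is to run k-means on the corresponding leading eigenvectors.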

Keywords: spectral clustering; number of clusters; pairwise constraints; adaptive

Classification number: O212 [Science: Probability Theory and Mathematical Statistics]

 
