Three-way Kernel Mean Shift Algorithm Based on Shadow Sets

Authors: MA Yunjie; WAN Renxia; YUE Xiaodong [3]

Affiliations: [1] School of Mathematics and Information Science, North Minzu University, Yinchuan 750021, Ningxia, China; [2] Ningxia Key Laboratory of Intelligent Information and Data Processing, Yinchuan 750021, Ningxia, China; [3] School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China

Source: Journal of Shanxi University (Natural Science Edition), 2025, No. 1, pp. 169-179 (11 pages)

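Funding: National Natural Science Foundation of China (62066001); Ningxia Science and Technology Leading Talent Project (2022GKLRLX08); Natural Science Foundation of Ningxia (2021AAC03203).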

Abstract: Mean shift is a hard-partition clustering algorithm. When applied to uncertain data, it can increase decision risk and reduce clustering accuracy. To address this, the paper introduces shadow set theory to handle the assignment of data objects in three-way clustering and proposes a three-way kernel mean shift clustering algorithm based on shadow sets (Three-way Kernel Mean Shift Algorithm Based on Shadow Sets, TKMSSS). The algorithm uses the class belonging probability to characterize the membership degree in the shadow sets. An optimization algorithm is used to obtain the optimal threshold for partitioning the shadow sets, which reduces the uncertainty introduced by manual intervention. Based on this optimal threshold, a three-way clustering is then formed according to shadow set membership. The algorithm is evaluated on 2 artificial datasets and 8 UCI public datasets. Compared with the mean shift algorithm, the adaptive bandwidth mean shift algorithm (ABMS), and the kernel mean shift algorithm (KMS), TKMSSS not only partitions the data effectively but also describes the boundary regions of clusters well. On clustering evaluation indexes including the Davies-Bouldin index, silhouette coefficient, accuracy, adjusted Rand index, and homogeneity, TKMSSS achieves either the best result or a result close to that of the best algorithm, which indicates that its overall clustering performance is better than that of the compared algorithms.

Keywords: shadow sets; three-way clustering; class belonging probability; optimization algorithm; cluster

Classification code: TP391 [Automation and Computer Technology: Computer Application Technology]
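
The thresholding step described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not give the objective function or the optimizer, so the sketch assumes Pedrycz's classic shadow-set balance criterion (reduced mass plus elevated mass should match the size of the shadow region) and a plain grid search. The function names `shadow_threshold` and `three_way_regions`, the grid, and the input vector `mu` (one cluster's class belonging probabilities, assumed to lie in [0, 1]) are all hypothetical.

```python
import numpy as np


def shadow_threshold(mu, grid=None):
    """Grid-search a threshold alpha in (0, 0.5) for one cluster.

    Assumption: uses Pedrycz's balance objective
    |reduced + elevated - shadow_size|, not the paper's own optimizer.
    """
    mu = np.asarray(mu, dtype=float)
    if grid is None:
        grid = np.linspace(0.01, 0.49, 49)  # hypothetical search grid
    best_alpha, best_cost = None, np.inf
    for alpha in grid:
        # Membership mass lost when low values are reduced to 0.
        reduced = mu[mu <= alpha].sum()
        # Membership mass gained when high values are elevated to 1.
        elevated = (1.0 - mu[mu >= 1.0 - alpha]).sum()
        # Number of objects left in the uncertain (shadow) zone.
        shadow = np.count_nonzero((mu > alpha) & (mu < 1.0 - alpha))
        cost = abs(reduced + elevated - shadow)
        if cost < best_cost:
            best_alpha, best_cost = float(alpha), cost
    return best_alpha


def three_way_regions(mu, alpha):
    """Split object indices into core, fringe (boundary), and trivial regions."""
    mu = np.asarray(mu, dtype=float)
    core = np.flatnonzero(mu >= 1.0 - alpha)
    fringe = np.flatnonzero((mu > alpha) & (mu < 1.0 - alpha))
    trivial = np.flatnonzero(mu <= alpha)
    return core, fringe, trivial
```

In a three-way clustering, objects in the core region would be assigned to the cluster with certainty, objects in the fringe region would form its boundary, and the remaining objects would be excluded from it. How TKMSSS computes the class belonging probabilities from the kernel mean shift result is not stated in the abstract and is left out of the sketch.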

 
