Authors: ZHANG Qing-Hua (张清华) [1,2]; ZHOU Jing-Peng (周靖鹏); DAI Yong-Yang (代永杨); WANG Guo-Yin (王国胤) [1,2]
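Affiliations: [1] Key Laboratory of Tourism Multisource Data Perception and Decision, Ministry of Culture and Tourism (Chongqing University of Posts and Telecommunications), Chongqing 400065, China; [2] Chongqing Key Laboratory of Computational Intelligence (Chongqing University of Posts and Telecommunications), Chongqing 400065, China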
Source: Journal of Software (《软件学报》), 2023, No. 12, pp. 5629-5648 (20 pages)
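Funding: National Key Research and Development Program of China (2020YFC2003502); National Natural Science Foundation of China (61876201); Natural Science Foundation of Chongqing (cstc2019jcyj-cxttX0002, cstc2021ycjh-bgzxm0013); Key Cooperation Project of the Chongqing Municipal Education Commission (HZ2021008).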
Abstract: Density peaks clustering (DPC) is a density-based clustering algorithm that can intuitively determine the number of clusters, identify clusters of arbitrary shape, and automatically detect and exclude outliers. However, DPC still has some shortcomings. First, it considers only the global distribution, so it clusters poorly on datasets whose clusters differ widely in density. Second, its point allocation strategy is prone to a "domino effect". To address these issues, this study proposes RKNN-DPC, a DPC algorithm based on representative points and K-nearest neighbors (KNN). First, a KNN density is constructed, and representative points are introduced to describe the global distribution of samples, yielding a new local density. Then, the KNN information of samples is used to design a weighted KNN allocation strategy that alleviates the domino effect. Finally, RKNN-DPC is compared with five clustering algorithms on synthetic and real-world datasets. The experimental results show that RKNN-DPC identifies cluster centers more accurately and obtains better clustering results.
Keywords: cluster analysis; density peaks clustering; representative points; K-nearest neighbors (KNN)
Classification number: TP18 [Automation and Computer Technology: Control Theory and Control Engineering]
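The abstract refers to the baseline DPC procedure without giving its formulas. For orientation only, the following is a minimal sketch of baseline DPC as it is commonly described (cutoff-kernel density, distance to the nearest denser point, and assignment along that chain). It is not the RKNN-DPC method of this paper: the function name `dpc`, the cutoff distance `dc`, and the requested number of centers `n_centers` are assumptions introduced for illustration. The assignment loop shows where a "domino effect" can arise, since each point simply inherits the label of its nearest denser neighbor.

```python
import math


def dpc(points, dc, n_centers):
    """Baseline DPC sketch (illustrative only, not RKNN-DPC).

    points: list of coordinate tuples; dc: cutoff distance (assumed parameter);
    n_centers: number of cluster centers to pick (assumed parameter).
    Returns (labels, centers), where centers holds point indices.
    """
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density: number of other points strictly closer than dc.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < dc)
           for i in range(n)]

    # Visit points from densest to sparsest; ties are broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))

    delta = [0.0] * n    # distance to the nearest denser point
    parent = [-1] * n    # index of that nearest denser point
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point has no denser neighbor; use its largest distance.
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j

    # Centers: points with the largest rho * delta.
    centers = sorted(range(n), key=lambda i: (-rho[i] * delta[i], i))[:n_centers]

    labels = [-1] * n
    for cluster_id, c in enumerate(centers):
        labels[c] = cluster_id

    # Each remaining point inherits the label of its nearest denser point.
    # One wrong assignment here propagates to every point that chains to it.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels, centers
```

A call such as `dpc(points, dc=1.0, n_centers=2)` returns one label per point together with the indices of the chosen centers. Because the density here depends on a single global `dc`, sparse clusters can be overshadowed by dense ones, which is the first shortcoming the abstract mentions.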