Authors: LI Qiwen (李启文); WANG Zhihe (王治和); DU Hui (杜辉); LU Depeng (鲁德鹏) (School of Computer Science and Engineering, Northwest Normal University, Lanzhou 730070, Gansu, China)
Affiliation: [1] School of Computer Science and Engineering, Northwest Normal University, Lanzhou 730070, Gansu, China
Source: Computer Engineering (《计算机工程》), 2025, No. 4, pp. 137-148 (12 pages)
Funding: National Natural Science Foundation of China (62372353).
Abstract: The Density Peak Clustering (DPC) algorithm can discover clusters of arbitrary shape and is robust to noise, so it is widely applied in many fields. However, DPC requires cluster centers to be selected manually and performs poorly on datasets with non-uniform density. To address this, an adaptive density peak clustering algorithm based on the Gaussian distribution is proposed. First, the product θ_i of local density and relative distance is computed; Z-score standardization maps θ_i into a two-dimensional space that follows a Gaussian distribution, and the standard deviation of that distribution is used to select cluster centers adaptively, yielding the set of cluster centers. Second, the remaining data points are assigned to the cluster of their nearest cluster center, giving a preliminary partition. Finally, a suture factor model is designed to compute the suture coefficient between clusters; when the suture coefficient exceeds a threshold, the most similar clusters in the preliminary partition are merged and the similarity matrix is updated, until merging is complete and the final result is obtained. Experimental results on artificial and real datasets show that, compared with the DBSCAN, DPC, and ICKDC algorithms, the proposed algorithm achieves higher clustering accuracy and better clustering performance.
Keywords: density peak clustering algorithm; Gaussian distribution; Z-score standardization; suture factor; inter-cluster similarity
Classification number: TP301.6 [Automation and Computer Technology - Computer System Architecture]
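The first two steps summarized in the abstract (scoring points by the product of local density and relative distance, Z-score standardizing that score, then assigning the remaining points to the nearest center) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not give the exact selection rule, so the sketch assumes a point is a center when its standardized score exceeds `k_sigma` standard deviations; the function name `adaptive_dpc_sketch`, the parameters `dc_percent` and `k_sigma`, the Gaussian-kernel density, and the percentile cutoff distance are all assumptions borrowed from common DPC practice. The suture factor merging step is omitted because the abstract does not define it.

```python
import math
import random
import statistics


def adaptive_dpc_sketch(points, dc_percent=2.0, k_sigma=3.0):
    """Hypothetical sketch: pick centers by Z-scored theta = rho * delta."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Cutoff distance d_c: a low percentile of all pairwise distances
    # (a common DPC heuristic, assumed here).
    pairs = sorted(dist[i][j] for i in range(n) for j in range(i + 1, n))
    dc = pairs[int(len(pairs) * dc_percent / 100)]

    # Local density rho_i with a Gaussian kernel, excluding the point itself.
    rho = [
        sum(math.exp(-((dist[i][j] / dc) ** 2)) for j in range(n) if j != i)
        for i in range(n)
    ]

    # Relative distance delta_i: distance to the nearest denser point;
    # the globally densest point gets its largest distance instead.
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))

    # theta_i = rho_i * delta_i, then Z-score standardization.
    theta = [r * d for r, d in zip(rho, delta)]
    mu = statistics.fmean(theta)
    sigma = statistics.pstdev(theta)
    z = [(t - mu) / sigma for t in theta]

    # Assumed rule: centers are points more than k_sigma deviations above the mean.
    centers = [i for i in range(n) if z[i] > k_sigma]
    if not centers:  # fall back to the single highest-scoring point
        centers = [max(range(n), key=z.__getitem__)]

    # Assign every point to the cluster of its nearest center.
    labels = [
        min(range(len(centers)), key=lambda c: dist[i][centers[c]])
        for i in range(n)
    ]
    return centers, labels, z


# Usage: two well-separated synthetic blobs of 100 points each.
rng = random.Random(0)
blob_a = [(rng.gauss(0.0, 0.5), rng.gauss(0.0, 0.5)) for _ in range(100)]
blob_b = [(rng.gauss(10.0, 0.5), rng.gauss(10.0, 0.5)) for _ in range(100)]
centers, labels, z = adaptive_dpc_sketch(blob_a + blob_b)
print(len(centers), "centers selected")
```

Thresholding the standardized score replaces the manual reading of a decision graph that plain DPC relies on; the value of `k_sigma` remains a tunable assumption in this sketch.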