Authors: CHEN Sugen (陈素根); ZHAO Zhizhong (赵志忠) (School of Mathematics and Physics, Anqing Normal University, Anqing 246133, Anhui, China; Key Laboratory of Modeling, Simulation and Control of Complex Ecosystem in Dabie Mountains of Anhui Higher Education Institutes, Anqing 246133, Anhui, China)
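Affiliations: [1] School of Mathematics and Physics, Anqing Normal University, Anqing 246133, Anhui, China; [2] Key Laboratory of Modeling, Simulation and Control of Complex Ecosystem in Dabie Mountains of Anhui Higher Education Institutes, Anqing 246133, Anhui, China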
Source: Journal of Shandong University (Engineering Science) (《山东大学学报(工学版)》), 2025, No. 2, pp. 58-70 (13 pages)
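Funding: Youth Program of the National Natural Science Foundation of China (61702012); General Program of the Anhui Provincial Natural Science Foundation (2008085MF193); Key Scientific Research Project of Anhui Higher Education Institutions (2024AH051095).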
Abstract: The truncation distance defined by the density peak clustering (DPC) algorithm considers only the global distribution of samples, and the sample assignment step is prone to the "domino" phenomenon. To address these problems, a density peak clustering algorithm combining local truncation distance and small-cluster merging is proposed. The truncation distance and local density of each sample are calculated from the local distribution information of the samples, which helps to identify density peaks accurately on datasets with complex structure. Potential density peaks are selected according to the differences between the samples' decision values, and multiple small clusters are formed around them. A new similarity between small clusters is defined, and the small clusters are merged according to this similarity to obtain the clustering result, which effectively avoids the "domino" phenomenon. The algorithm was evaluated on six synthetic datasets and eight UCI datasets. Averaged over these 14 datasets, its normalized mutual information, adjusted Rand index, and adjusted mutual information were 18.15%, 28.99%, and 20.22% higher, respectively, than the average of the five comparison algorithms, and 30.06%, 47.15%, and 31.90% higher than the original density peak clustering algorithm. The experimental results show that the proposed algorithm achieves good clustering performance.
Keywords: clustering; density peak clustering; truncation distance; local density; potential density peaks
Classification number: TP391 [Automation and Computer Technology - Computer Application Technology]
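
For readers unfamiliar with the baseline that the abstract refers to, the following is a minimal sketch of the original density peak clustering quantities: local density, distance to the nearest denser sample, and their product as the decision value. It is not the proposed algorithm; the record gives no formulas for the local truncation distance or the small-cluster similarity, so neither is shown. The function names, the Gaussian kernel, the single global cutoff `dc`, and the choice of the `k` largest decision values as peaks are all assumptions made for illustration.

```python
import numpy as np

def dpc_quantities(X, dc):
    """Baseline DPC quantities with one global cutoff dc (an assumed parameter)."""
    # Pairwise Euclidean distances between all samples.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Gaussian-kernel local density; subtract 1 to drop each sample's self term.
    rho = np.exp(-(dist / dc) ** 2).sum(axis=1) - 1.0

    n = len(X)
    delta = np.zeros(n)
    nearest_higher = np.full(n, -1)
    order = np.argsort(-rho, kind="stable")  # indices by descending density
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest sample has no denser neighbour; use the largest distance.
            delta[i] = dist.max()
            continue
        higher = order[:rank]
        j = higher[np.argmin(dist[i, higher])]
        delta[i] = dist[i, j]
        nearest_higher[i] = j
    gamma = rho * delta  # decision value
    return rho, delta, gamma, nearest_higher, order

def dpc_cluster(X, dc, k):
    """Pick the k largest decision values as peaks, then assign the rest."""
    _, _, gamma, nearest_higher, order = dpc_quantities(X, dc)
    peaks = np.argsort(-gamma, kind="stable")[:k]
    labels = np.full(len(X), -1)
    labels[peaks] = np.arange(k)
    # Each remaining sample inherits the label of its nearest denser sample.
    # One wrong inheritance is passed on to every sample that follows it,
    # which is the "domino" phenomenon mentioned in the abstract.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[nearest_higher[i]]
    return labels, peaks
```

Calling `dpc_cluster(X, dc, k)` on an `(n, d)` array returns one integer label per sample together with the indices of the chosen peaks. Because every non-peak sample simply follows its nearest denser sample, the result is sensitive to the choice of `dc`, which is the limitation the abstract attributes to a globally defined truncation distance.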