Authors: CHEN Chao-quan (陈超泉); WANG Jia-ming (王佳明) [2]; XIE Xiao-lan (谢晓兰) (Guangxi Key Laboratory of Embedded Technology and Intelligent Systems, Guilin 514004, China; College of Information Science and Engineering, Guilin University of Technology, Guilin 514004, China)
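Affiliations: [1] Guangxi Key Laboratory of Embedded Technology and Intelligent Systems, Guilin 541006; [2] College of Information Science and Engineering, Guilin University of Technology, Guilin 541006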
Source: Science Technology and Engineering (《科学技术与工程》), 2022, No. 10, pp. 4011-4018 (8 pages)
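Funding: National Natural Science Foundation of China (61762031); Guangxi Science and Technology Major Project (AA19046004); Guangxi Key Research and Development Program (AB18126006).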
Abstract: The density peak clustering algorithm relies on Euclidean distance to compute local density, and it performs poorly on high-dimensional data and on datasets containing clusters of non-uniform density. To address these problems, an improved density peak clustering algorithm combining manifold distance and label propagation (DPC-ML) is proposed. DPC-ML uses manifold distance as its distance metric and builds a manifold distance matrix. It also defines a new local density that incorporates manifold distance, so that the local density reflects some local distance information. Experimental results show that the algorithm performs well on clusters of different shapes and non-uniform density. Decision graphs drawn on several artificial datasets further show that the local density redefined in DPC-ML separates cluster centers from other points more clearly. Because a new parameter, the number of neighboring points, is introduced, its effect on the clustering results is also examined: the clustering metrics are best when the neighbor graph has just become connected, which further indicates that manifold distance can improve clustering performance.
Classification number: TP301.6 [Automation and Computer Technology: Computer System Architecture]
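Note: the abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the authors' DPC-ML. It assumes that "manifold distance" can be approximated by shortest-path length on a symmetric k-nearest-neighbor graph, and it uses the classic density peak quantities (Gaussian-kernel density `rho`, distance-to-denser-point `delta`) rather than the paper's redefined local density. Label propagation is omitted. All function names, the cutoff `d_c`, and the toy data are hypothetical.

```python
import heapq
import math


def knn_graph(points, k):
    """Symmetric k-nearest-neighbor graph with Euclidean edge weights."""
    n = len(points)
    adj = [dict() for _ in range(n)]
    for i in range(n):
        dists = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        for d, j in dists[:k]:
            adj[i][j] = d
            adj[j][i] = d  # symmetrize so the graph is undirected
    return adj


def geodesic_matrix(adj):
    """All-pairs shortest-path lengths (Dijkstra from every node).

    Assumption: graph geodesics stand in for the manifold distance.
    Unreachable pairs keep the value math.inf.
    """
    n = len(adj)
    matrix = []
    for src in range(n):
        dist = [math.inf] * n
        dist[src] = 0.0
        heap = [(0.0, src)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[u]:
                continue  # stale heap entry
            for v, w in adj[u].items():
                nd = d + w
                if nd < dist[v]:
                    dist[v] = nd
                    heapq.heappush(heap, (nd, v))
        matrix.append(dist)
    return matrix


def density_peak_scores(D, d_c):
    """Classic density peak quantities computed on a distance matrix D."""
    n = len(D)
    # Gaussian-kernel local density; unreachable points contribute 0.
    rho = [
        sum(math.exp(-((D[i][j] / d_c) ** 2)) for j in range(n) if j != i)
        for i in range(n)
    ]
    delta = []
    for i in range(n):
        higher = [D[i][j] for j in range(n) if rho[j] > rho[i]]
        if higher:
            delta.append(min(higher))  # distance to nearest denser point
        else:
            # Densest point: use its largest finite distance instead.
            delta.append(max(x for x in D[i] if x != math.inf))
    return rho, delta


# Toy data: two well-separated groups on a line (hypothetical example).
points = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
D = geodesic_matrix(knn_graph(points, k=3))
rho, delta = density_peak_scores(D, d_c=2.0)
gamma = [r * d for r, d in zip(rho, delta)]  # decision-graph score
centers = sorted(sorted(range(len(points)), key=lambda i: -gamma[i])[:2])
print(centers)
```

On this toy data the neighbor graph is disconnected at `k=2` and first becomes connected at `k=3`, which is the regime the abstract reports as giving the best clustering metrics. Points with both large `rho` and large `delta` are the candidate cluster centers read off a decision graph.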