Authors: ZHANG Dan-dan; YOU Zi-yi [1]; ZHENG Jian; CHEN Shi-guo [1]
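Affiliations: [1] School of Physics and Electronic Sciences, Guizhou Normal University, Guiyang 550001, Guizhou, China; [2] Guizhou Rural Credit Cooperative Association, Guiyang 550081, Guizhou, China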
Source: Microelectronics & Computer (《微电子学与计算机》), 2019, No. 11, pp. 43-48 (6 pages)
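Funding: National Natural Science Foundation of China (61462015); Guizhou Normal University Graduate Innovation Fund Project (YC[2018]016); Guizhou Provincial Science and Technology Plan Project (黔科合LH字[2016]7223号)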
Abstract: Cluster analysis has long attracted attention from researchers in unsupervised learning. To address three shortcomings of the K-means clustering algorithm (sensitivity to the initial cluster centers, poor correlation among data within a cluster, and convergence to local optima), this paper proposes an optimized clustering algorithm based on outlier factors. The algorithm first uses an information-entropy-weighted Euclidean distance as its similarity measure, so that differences between data objects are distinguished more clearly. It then applies a Local Outlier Factor (LOF) detection algorithm with a self-adjusting k-distance parameter to compute the outlier factor of each data point and to select a candidate set of initial cluster centers. Finally, it optimizes the cluster centers using an outlier-factor-weighted distance method. Experiments on UCI datasets show that the optimized algorithm achieves higher accuracy than the K-means++, OFMMK-means, and FCM algorithms, and runs faster than FCM. The algorithm is well suited to applications such as intrusion detection, credit risk assessment, and multi-fault diagnosis.
Keywords: clustering; K-means; weighted Euclidean distance; LOF algorithm; optimization
Classification number: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]
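The abstract's first step, an information-entropy-weighted Euclidean distance, can be illustrated with a minimal sketch. The record does not give the paper's formulas, so this assumes the common entropy-weight method: each feature is min-max scaled, its Shannon entropy is computed over the samples, and features with lower entropy receive larger weights. The function names `entropy_weights` and `weighted_euclidean` are hypothetical and chosen for illustration only.

```python
import math


def entropy_weights(data):
    """Return one weight per feature using the entropy-weight method (an assumption,
    not necessarily the paper's exact definition). Weights sum to 1."""
    n = len(data)
    m = len(data[0])
    if n < 2:
        # Entropy is undefined for a single sample; fall back to uniform weights.
        return [1.0 / m] * m

    diversities = []
    for j in range(m):
        col = [row[j] for row in data]
        lo, hi = min(col), max(col)
        span = hi - lo
        if span == 0:
            # A constant feature cannot separate any two points.
            diversities.append(0.0)
            continue
        # Min-max scale, then turn the column into a probability distribution.
        scaled = [(v - lo) / span for v in col]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        # Shannon entropy normalised to [0, 1] by dividing by ln(n).
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        diversities.append(1.0 - entropy)

    total_div = sum(diversities)
    if total_div == 0:
        return [1.0 / m] * m
    return [d / total_div for d in diversities]


def weighted_euclidean(x, y, weights):
    """Euclidean distance with a per-feature weight on each squared difference."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)))


if __name__ == "__main__":
    # Toy data: the second feature is constant, so it should get zero weight.
    samples = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0], [10.0, 10.0]]
    w = entropy_weights(samples)
    print(w)
    print(weighted_euclidean(samples[0], samples[3], w))
```

In a K-means variant, `weighted_euclidean` would replace the plain Euclidean distance when assigning points to centers; the LOF-based center selection described in the abstract is not covered by this sketch.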