k-modes clustering algorithm based on intra-cluster and inter-cluster dissimilarity  (Cited by: 1)


Authors: JIA Zi-qi [1]; SONG Ling [2]

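Affiliations: [1] School of Computer and Software, Nanyang Institute of Technology, Nanyang 473004, Henan, China; [2] School of Computer, Electronics and Information, Guangxi University, Nanning 530004, Guangxi, China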

Source: Computer Engineering and Design (计算机工程与设计), 2021, No. 9, pp. 2492-2500 (9 pages)

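Funding: National Natural Science Foundation of China (61762030); Guangxi Innovation-Driven Major Special Fund Project (桂科AA17204017); Guangxi Key Research and Development Program (桂科AB19110050, 桂科AB18126094).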

Abstract: To improve the accuracy of the k-modes algorithm and to address the selection of initial cluster centers, a k-modes clustering algorithm based on intra-cluster and inter-cluster dissimilarity (IKMCA) is proposed. The dissimilarity coefficient is improved on the basis of intra-cluster and inter-cluster similarity, and a concrete method is given for selecting the initial cluster centers automatically. The proposed dissimilarity coefficient considers both the dissimilarity of the attribute values themselves and how well other related attributes distinguish between them. The proposed initialization method determines the number of clusters and the positions of the initial cluster centers automatically. Experimental results show that IKMCA outperforms the classic k-modes algorithm and its variants in clustering accuracy, purity, and recall.
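For context, the classic k-modes baseline that the abstract compares against can be sketched as follows. This is a minimal illustration, not the paper's IKMCA: it uses the standard simple-matching dissimilarity (a count of mismatched attributes) and takes the initial modes as an argument, because this record does not specify the improved dissimilarity coefficient or the automatic initialization. The function names and the toy data are assumptions made for illustration.

```python
from collections import Counter


def matching_dissimilarity(a, b):
    """Simple matching: number of attributes on which two records differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(records):
    """Most frequent value of each attribute across the given records."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*records))


def k_modes(data, initial_modes, max_iter=100):
    """Classic k-modes: assign each record to its nearest mode, then update modes."""
    modes = [tuple(m) for m in initial_modes]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(modes)),
                key=lambda j: matching_dissimilarity(row, modes[j]))
            for row in data
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(len(modes)):
            members = [row for row, lab in zip(data, labels) if lab == j]
            if members:  # keep the old mode if a cluster becomes empty
                modes[j] = column_modes(members)
    return labels, modes


# Hypothetical categorical records: (color, size, shape).
data = [
    ("red", "small", "round"),
    ("red", "small", "oval"),
    ("red", "large", "round"),
    ("blue", "large", "square"),
    ("blue", "large", "oval"),
    ("blue", "small", "square"),
]
labels, modes = k_modes(data, initial_modes=[data[0], data[3]])
print(labels)  # [0, 0, 0, 1, 1, 1]
print(modes)
```

In this baseline both the dissimilarity measure and the initial modes are supplied by hand; those are the two components the abstract says IKMCA replaces.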

Keywords: k-modes algorithm; intra-cluster and inter-cluster similarity; categorical data; frequency; dissimilarity coefficient

Classification code: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]

 
