A fuzzy K-prototypes clustering algorithm based on information gain

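Affiliations: [1] School of Computer Science, Guangxi University of Science and Technology, Liuzhou, Guangxi 545006 [2] School of Electrical and Information Engineering, Guangxi University of Science and Technology, Liuzhou, Guangxi 545006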

Source: Computer Engineering & Science, 2015, No. 5, pp. 1009-1014 (6 pages)

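Funding: National Natural Science Foundation of China (61462008; 61364006); Guangxi Natural Science Foundation (2013GXNSFAA019336); Guangxi Higher Education Science and Technology Research Projects (LX2014190; YB2014210; LX2014190); Guangxi University of Science and Technology Science Foundation (校科自1261128)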

Abstract: The K-prototypes clustering algorithm combines the K-means and K-modes algorithms and can be used to analyze data objects with mixed (numeric and categorical) attributes. When computing the dissimilarity between data objects, the traditional K-prototypes algorithm does not account for how strongly each attribute influences the final clustering result, although in the real world attributes differ in importance. This paper uses information gain from information theory to obtain a weight for each attribute and multiplies each attribute's dissimilarity by its weight, which yields more accurate clustering results. To improve the handling of fuzzy problems, the algorithm also incorporates fuzzy theory, giving it better robustness to noise and a better ability to deal with uncertainty. Clustering experiments on four UCI data sets demonstrate the effectiveness of the algorithm.
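The abstract names three ingredients (information-gain attribute weights, a weighted mixed-attribute dissimilarity, and fuzzy memberships) but does not give their formulas. The following is only a minimal sketch of how they could fit together, not the paper's actual method. It assumes that information gain is computed against known class labels, that numeric attributes have been discretized before the gain is computed, that the gains are normalized to sum to 1, and that memberships follow the standard fuzzy c-means update with fuzzifier `m`. All function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Entropy of the labels minus their conditional entropy given one discrete attribute."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def gain_weights(columns, labels):
    """Assumed weighting scheme: per-attribute gains normalized to sum to 1."""
    gains = [information_gain(col, labels) for col in columns]
    total = sum(gains)
    if total == 0:
        # No attribute is informative: fall back to equal weights.
        return [1.0 / len(gains)] * len(gains)
    return [g / total for g in gains]


def weighted_dissimilarity(x, prototype, weights, numeric_idx, categorical_idx):
    """Weighted squared difference on numeric attributes plus weighted mismatch on categorical ones."""
    d = sum(weights[j] * (x[j] - prototype[j]) ** 2 for j in numeric_idx)
    d += sum(weights[j] * (x[j] != prototype[j]) for j in categorical_idx)
    return d


def fuzzy_memberships(distances, m=2.0):
    """Fuzzy c-means style memberships of one object, given its dissimilarity to each prototype."""
    zeros = [i for i, d in enumerate(distances) if d == 0]
    if zeros:
        # The object coincides with one or more prototypes: share full membership among them.
        return [1.0 / len(zeros) if i in zeros else 0.0 for i in range(len(distances))]
    exponent = 1.0 / (m - 1.0)
    inverse = [(1.0 / d) ** exponent for d in distances]
    total = sum(inverse)
    return [v / total for v in inverse]


if __name__ == "__main__":
    labels = ["a", "a", "b", "b"]
    columns = [[0, 0, 1, 1], [0, 1, 0, 1]]  # first attribute predicts the label, second does not
    weights = gain_weights(columns, labels)
    print(weights)
    print(fuzzy_memberships([1.0, 3.0]))
```

In this sketch an attribute that carries no information about the labels receives weight 0 and therefore drops out of the dissimilarity entirely; whether the paper smooths or bounds the weights to avoid that is not stated in the abstract.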

Keywords: clustering; information gain; fuzzy K-prototypes algorithm; mixed data

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
