Clustering Categorical Data Based on Within-Cluster Relative Mean Difference  


Authors: Jinxia Su, Chunjing Su

Affiliation: School of Mathematics and Statistics, Lanzhou University, Lanzhou, China

Source: Open Journal of Statistics, 2017, No. 2, pp. 173-181 (9 pages)

Abstract: Clustering of categorical variables has received intensive attention. In a dataset with categorical features, some features perform better than others in the clustering procedure. In this paper, we propose a simple method that finds such distinctive features by comparing the pooled within-cluster mean relative difference, then partitions the data on those features and gives the subspace of each subgroup. Applications to the zoo data and the soybean data illustrate the performance of the proposed method.

Keywords: Clustering; Categorical Variable; Distinctive Attribute; Pooled Within-Cluster Mean Relative Difference; Hamming Distance

Classification code: R73 [Medicine and Health: Oncology]
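
The abstract and keywords mention Hamming distance and a pooled within-cluster comparison of features, but this record does not give the paper's formulas. The sketch below is only an illustration of the general idea, not the authors' statistic: `hamming` is the standard mismatch count between two categorical records, while `within_cluster_mismatch` is a hypothetical stand-in that pools, over all clusters, the share of same-cluster record pairs that disagree on one feature. The toy records and labels are invented for the example.

```python
from itertools import combinations


def hamming(a, b):
    """Count the positions at which two equal-length categorical records differ."""
    if len(a) != len(b):
        raise ValueError("records must have the same number of features")
    return sum(x != y for x, y in zip(a, b))


def within_cluster_mismatch(records, labels, feature):
    """Hypothetical helper (not the paper's definition).

    Returns the pooled share of same-cluster record pairs that disagree
    on the feature at index `feature`. Lower values mean the feature is
    more homogeneous inside the given clusters.
    """
    # Group the chosen feature's values by cluster label.
    clusters = {}
    for rec, lab in zip(records, labels):
        clusters.setdefault(lab, []).append(rec[feature])

    mismatches = 0
    pairs = 0
    for values in clusters.values():
        for x, y in combinations(values, 2):
            mismatches += x != y
            pairs += 1
    return mismatches / pairs if pairs else 0.0


# Invented toy data: three categorical features per record.
records = [
    ("yes", "no", "4"),
    ("yes", "no", "2"),
    ("no", "yes", "2"),
    ("no", "yes", "4"),
]
labels = [0, 0, 1, 1]

d_far = hamming(records[0], records[2])
d_near = hamming(records[0], records[1])
scores = [within_cluster_mismatch(records, labels, j) for j in range(3)]
```

In this toy setup the first two features are constant inside each cluster and the third is not, so a feature-ranking step built on such a score would favour the first two as candidates for partitioning the data.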

 
