Clustering Algorithm for Mixed Data Based on Residual Analysis

Cited by: 12

Authors: QIU Bao-Zhi (邱保志)[1]; ZHANG Rui-Lin (张瑞霖); LI Xiang-Li (李向丽)[1] (School of Information Engineering, Zhengzhou University, Zhengzhou 450001)

Affiliation: [1] School of Information Engineering, Zhengzhou University, Zhengzhou 450001

Source: Acta Automatica Sinica (《自动化学报》), 2020, No. 7, pp. 1420-1432 (13 pages)

Funding: Supported by the Basic and Frontier Technology Research Project of Henan Province (152300410191).

Abstract: To address the low accuracy and parameter sensitivity of existing clustering algorithms for mixed-attribute data, this paper proposes RA-Clust, a clustering algorithm for mixed data based on residual analysis. The algorithm measures the similarity between objects with an improved entropy-weighted similarity measure for mixed attributes, and computes the density of each object with a proposed local density method based on KNN and Parzen windows. Cluster centers are preselected by linear regression and residual analysis, and the true cluster centers are then determined by a proposed objective optimization model for cluster centers. Finally, the remaining objects are assigned to the corresponding clusters according to their minimum distance to high-density objects, which yields the final clustering. Experimental results on synthetic datasets and UCI datasets verify the effectiveness of the algorithm. Compared with similar algorithms, RA-Clust achieves higher clustering accuracy.
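The pipeline named in the abstract can be sketched roughly as below. This is an illustrative sketch under stated assumptions, not the authors' RA-Clust: the abstract gives no formulas, so the Gaussian kernel, the bandwidth choice, the density-peaks-style "distance to the nearest higher-density object", the residual threshold `z`, and all function names are assumptions made here. The improved entropy-weighted mixed-attribute similarity and the objective optimization model for the final centers are not reproduced; the sketch takes any precomputed distance matrix and stops at the preselected centers.

```python
import numpy as np

def knn_density(dist, k):
    """Assumed density: Gaussian (Parzen-style) kernel summed over the k nearest neighbours."""
    knn = np.sort(dist, axis=1)[:, 1:k + 1]   # column 0 is the point itself
    h = knn.mean()                            # assumed bandwidth: mean k-NN distance
    return np.exp(-(knn / h) ** 2).sum(axis=1)

def delta_and_parent(dist, rho):
    """For each object: distance to, and index of, its nearest higher-density object."""
    n = len(rho)
    order = np.argsort(-rho, kind="stable")   # indices by decreasing density
    delta = np.zeros(n)
    parent = np.full(n, -1)
    delta[order[0]] = dist[order[0]].max()    # densest object has no parent
    for rank in range(1, n):
        i = order[rank]
        higher = order[:rank]
        j = higher[np.argmin(dist[i, higher])]
        delta[i], parent[i] = dist[i, j], j
    return delta, parent, order

def preselect_centers(rho, delta, z=3.0):
    """Fit delta ~ a*rho + b by least squares; keep large positive residuals."""
    a, b = np.polyfit(rho, delta, 1)
    resid = delta - (a * rho + b)
    return np.flatnonzero(resid > z * resid.std())

def cluster(dist, k=5, z=3.0):
    rho = knn_density(dist, k)
    delta, parent, order = delta_and_parent(dist, rho)
    centers = preselect_centers(rho, delta, z)
    # Assumption: the densest object is always a center, so every chain of parents ends in one.
    centers = np.union1d(centers, [order[0]])
    labels = np.full(len(rho), -1)
    labels[centers] = np.arange(len(centers))
    for i in order:                           # parents are labelled before their children
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return labels

# Toy usage on numeric data with Euclidean distances (a stand-in for a mixed-attribute measure).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (50, 2)), rng.normal(10.0, 0.3, (50, 2))])
dist = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
labels = cluster(dist, k=5)
```

Because the functions only consume a distance matrix, a mixed-attribute dissimilarity (numeric plus categorical) could be substituted for the Euclidean distances without changing the rest of the sketch.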

Keywords: clustering; residual analysis; linear regression; mixed-attribute datasets; cluster centers

Classification: TP311.13 [Automation and Computer Technology: Computer Software and Theory]
