The Mixed Attribute Data Clustering Method on Generalized Linear Model (cited by: 4)


Author: ZHU Yong-jie (Information Management Center, Xuchang University, Xuchang 461099, China)

Affiliation: [1] Information Management Center, Xuchang University, Xuchang 461099, China

Source: Science Technology and Engineering, 2021, No. 4, pp. 1448-1453 (6 pages)

Funding: Key Project of the Science and Technology Office of Xuchang University (2019044).

Abstract: To address the difficulty of clustering mixed-attribute data, a clustering method based on a generalized linear model is proposed. First, a low-order multivariate generalized linear model is constructed to handle the clustering of massive data, and the temporal characteristics of the data attributes are taken into account to obtain an attribute time series matrix. Then, this attribute time series matrix is considered when the mixed-attribute data are processed with an optimized K-prototypes clustering method. Finally, in addition to the distance between each sample and the cluster centers, the information content of known samples is taken into account: an optimized method is used to compute the data dissimilarity and the distance between samples and cluster sets. The computation terminates when the clustering results become stable, and the clustering results are output. An experimental analysis was carried out to verify the effectiveness of the method. The results show that the method reaches an optimized partition of mixed-attribute data into cluster sets within relatively few iterations, with clustering fitness values between 0.88 and 0.94. The method accurately reflects the differences among samples and is a high-accuracy clustering method for mixed-attribute data.
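The abstract does not give the optimized dissimilarity, the generalized linear model, or the attribute time series matrix in enough detail to reproduce them. As a point of reference only, the sketch below shows the standard (unoptimized) K-prototypes procedure that the method builds on: squared Euclidean distance on numeric attributes plus a weighted count of categorical mismatches, with iteration stopping once the cluster assignments no longer change. The function names, the weight `gamma`, and the random initialization are assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter

import numpy as np


def mixed_dissimilarity(x_num, x_cat, c_num, c_cat, gamma=1.0):
    """Standard K-prototypes dissimilarity (not the paper's optimized form):
    squared Euclidean distance on numeric attributes plus gamma times the
    number of categorical attributes that differ."""
    diff = np.asarray(x_num, dtype=float) - np.asarray(c_num, dtype=float)
    numeric_part = float(np.sum(diff ** 2))
    categorical_part = sum(a != b for a, b in zip(x_cat, c_cat))
    return numeric_part + gamma * categorical_part


def k_prototypes(X_num, X_cat, k, gamma=1.0, max_iter=100, seed=0):
    """Plain K-prototypes: assign each sample to its nearest prototype,
    update prototypes (mean for numeric, mode for categorical), and stop
    when the assignments are stable."""
    X_num = np.asarray(X_num, dtype=float)
    n = len(X_num)
    rng = np.random.default_rng(seed)
    init = rng.choice(n, size=k, replace=False)  # hypothetical init choice
    protos_num = X_num[init].copy()
    protos_cat = [list(X_cat[i]) for i in init]
    labels = np.full(n, -1)

    for _ in range(max_iter):
        new_labels = np.array([
            min(range(k), key=lambda j: mixed_dissimilarity(
                X_num[i], X_cat[i], protos_num[j], protos_cat[j], gamma))
            for i in range(n)
        ])
        if np.array_equal(new_labels, labels):
            break  # clustering result is stable
        labels = new_labels
        for j in range(k):
            members = np.flatnonzero(labels == j)
            if members.size == 0:
                continue  # keep the old prototype for an empty cluster
            protos_num[j] = X_num[members].mean(axis=0)
            for a in range(len(protos_cat[j])):
                values = [X_cat[i][a] for i in members]
                protos_cat[j][a] = Counter(values).most_common(1)[0][0]
    return labels


# Toy data: one numeric and one categorical attribute per sample.
X_num = [[0.0], [0.1], [0.2], [10.0], [10.1], [10.2]]
X_cat = [["a"], ["a"], ["a"], ["b"], ["b"], ["b"]]
labels = k_prototypes(X_num, X_cat, k=2)
```

On the toy data, the first three samples and the last three samples should end up in different clusters. The weight `gamma` controls how much a categorical mismatch counts relative to numeric distance; the paper's optimized dissimilarity presumably replaces this simple sum, but its exact form is not stated in the abstract.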

Keywords: generalized linear model; mixed attributes; data; time series matrix; K-prototypes clustering; iteration

Classification number: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]

 
