Projection Clustering Analysis Algorithm Based on Gaussian Mixture Model and Correlated Subspaces


Authors: WU Zheng-ping; XUN Ya-ling [1] (School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, China)

Affiliation: [1] School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, China

Source: Journal of Taiyuan University of Science and Technology, 2021, No. 5, pp. 386-392 (7 pages)

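Funding: National Youth Science Foundation of China (61602335); Natural Science Foundation of Shanxi Province (201901D211302); Doctoral Research Start-up Fund of Taiyuan University of Science and Technology (20172017).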

Abstract: To address the "curse of dimensionality" and the sparsity problems of high-dimensional data, as well as the distinct characteristics of each attribute dimension, a projection clustering analysis algorithm is presented that uses correlated subspaces defined by a Gaussian mixture model. First, K-nearest neighbors (KNN) is used to obtain the local data set (LDS) of each data object, and a sparsity factor is introduced to generate a sparsity matrix, which reflects the sparse and dense regions of the data set; the correlated and uncorrelated subspaces are then identified from the Gaussian mixture model and the sparsity matrix. Second, sparse data and irrelevant attribute dimensions are eliminated according to a similarity measure, and clusters are formed with the K-means algorithm. Finally, experiments on UCI data sets verify the effectiveness and accuracy of the algorithm.
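The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not define the sparsity factor or the similarity measure, so the sketch assumes (hypothetically) that the sparsity of an attribute for an object is the standard deviation of that attribute within the object's KNN local data set, and that an attribute is relevant when most of its sparsity values fall in the lower-mean component of a two-component Gaussian mixture. The removal of sparse data objects is omitted. The function names `sparsity_matrix`, `relevant_attributes`, `projection_cluster`, and `make_demo` are illustrative only; the library calls are from NumPy and scikit-learn.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture
from sklearn.neighbors import NearestNeighbors


def sparsity_matrix(X, k=10):
    """Assumed sparsity measure: per-attribute std within each object's local data set."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)        # each row: the object itself plus its k neighbors
    return X[idx].std(axis=1)        # shape (n_objects, n_attributes)


def relevant_attributes(S, seed=0):
    """Fit a 2-component GMM to all sparsity values; keep mostly-dense attributes."""
    values = S.reshape(-1, 1)
    gmm = GaussianMixture(n_components=2, random_state=seed).fit(values)
    dense = int(np.argmin(gmm.means_.ravel()))          # lower mean = denser
    is_dense = gmm.predict(values).reshape(S.shape) == dense
    return np.flatnonzero(is_dense.mean(axis=0) > 0.5)  # assumed majority rule


def projection_cluster(X, n_clusters, k=10, seed=0):
    """Min-max scale, select relevant attributes, then run K-means on them."""
    X = np.asarray(X, dtype=float)
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0                               # guard constant columns
    Xs = (X - X.min(axis=0)) / span
    attrs = relevant_attributes(sparsity_matrix(Xs, k), seed)
    if attrs.size == 0:                                 # fall back to all attributes
        attrs = np.arange(X.shape[1])
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return km.fit_predict(Xs[:, attrs]), attrs


def make_demo(seed=0):
    """Synthetic data: 3 clusters in columns 0-1, uniform noise in columns 2-4."""
    rng = np.random.default_rng(seed)
    centers = np.array([[0, 0], [10, 0], [0, 10]], dtype=float)
    signal = np.vstack([c + 0.3 * rng.standard_normal((100, 2)) for c in centers])
    noise = rng.uniform(0, 10, size=(300, 3))
    return np.hstack([signal, noise]), np.repeat(np.arange(3), 100)


if __name__ == "__main__":
    X, y = make_demo()
    labels, attrs = projection_cluster(X, n_clusters=3)
    print("selected attributes:", attrs)
```

On synthetic data of this kind, where only the first two columns carry cluster structure, the local spread in those columns is much smaller than in the noise columns, so the mixture model would be expected to separate the two groups of attributes before K-means runs on the reduced space.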

Keywords: correlated subspace; Gaussian mixture model; sparsity; projection clustering; high-dimensional data

Classification number: TP311 [Automation and Computer Technology: Computer Software and Theory]

 
