Authors: WANG Jun (王俊) [1]; YANG Ru (杨茹); CHENG Xian-sheng (程显生)
Affiliation: [1] Department of Computer Technology and Information Management, Inner Mongolia Agricultural University, Baotou, Inner Mongolia 014109, China
Source: Computer Simulation (《计算机仿真》), 2020, No. 6, pp. 475-478 (4 pages)
Abstract: Traditional segmented clustering methods for cloud storage data suffer from low running efficiency and insufficiently smooth clustering results. To address this, a machine-learning-based segmented clustering method for cloud storage data is proposed. Several small data sets are sampled from the cloud storage database such that together they contain all of its natural clusters, and a similarity matrix is built according to the similarity definition. A nonlinear kernel principal component algorithm is used to measure the similarity of the data in that matrix, and data sharing the same features are grouped into one class. A Gaussian mixture probability density model then computes the posterior probability of each class of data, and the segmented clustering is obtained by comparing these probabilities. Experimental results show that the proposed method shortens clustering run time and reduces the clustering variation degree to 29%, effectively improving the smoothness of the clustering results.
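The final step of the abstract, computing posterior probabilities under a Gaussian mixture and assigning each item by comparing them, can be illustrated with a minimal sketch. This is not the authors' implementation: the one-dimensional data, the function names, and the mixture parameters (WEIGHTS, MEANS, VARIANCES) are all hypothetical, chosen only to show the Bayes-rule comparison. The sampling, similarity-matrix, and kernel principal component steps are not shown.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def posteriors(x, weights, means, variances):
    """Posterior probability of each mixture component given x (Bayes' rule)."""
    joint = [w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    total = sum(joint)
    return [j / total for j in joint]


def assign(x, weights, means, variances):
    """Index of the component with the largest posterior probability."""
    post = posteriors(x, weights, means, variances)
    return max(range(len(post)), key=post.__getitem__)


# Hypothetical two-component mixture; the paper does not give these values.
WEIGHTS = [0.5, 0.5]
MEANS = [0.0, 10.0]
VARIANCES = [1.0, 1.0]

# Hypothetical sample values, each assigned to its most probable component.
labels = [assign(x, WEIGHTS, MEANS, VARIANCES) for x in [0.2, 9.7, 1.1, 8.4]]
```

In practice the mixture parameters would be estimated from the data (for example with expectation-maximization) rather than fixed by hand as they are here.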
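Keywords: natural cluster; similarity matrix; nonlinear kernel principal component algorithm; Gaussian mixture probability density model
Classification: TP391.41 [Automation and Computer Technology: Computer Application Technology]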