Authors: DONG Hua-song (董华松)[1], LIAN Yuan-feng (连远锋)[1]
Affiliation: [1] College of Information Science and Engineering, China University of Petroleum (Beijing), Beijing 102249, China
Source: Computer Simulation (《计算机仿真》), 2024, No. 5, pp. 460-464 (5 pages)
Abstract: Unlike single-attribute data, mixed-attribute data usually exhibit inconsistent scales. To obtain more accurate clustering results for such data, this paper proposes a mixed-attribute clustering algorithm based on k-nearest neighbors. First, a sliding window over the high-frequency coefficients is used to estimate the noise variance of the noisy mixed-attribute data, the optimal threshold is obtained with the BayesShrink threshold estimation algorithm, and the data are denoised. Next, the k-nearest neighbor method is used for clustering: feature weights are added to the contribution of the denoised samples, and the feature-weighted Euclidean distance incorporating the contribution is computed; the smaller the distance, the higher the probability that the data belong to the same category. After all sample features are weighted, a mixed-attribute clustering model is constructed, and particle swarm optimization is used to optimize the model and obtain the optimal weighted feature vector, which completes the clustering of the mixed-attribute data. Simulation results show that the proposed algorithm effectively improves both the accuracy and the efficiency of mixed-attribute clustering.
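The feature-weighted Euclidean distance at the core of the k-nearest neighbor step can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `weighted_euclidean` and `k_nearest` are hypothetical, the weights are supplied by hand rather than found by particle swarm optimization, and categorical attributes are assumed to have been encoded numerically beforehand.

```python
import math


def weighted_euclidean(x, y, weights):
    """Feature-weighted Euclidean distance between two equal-length vectors.

    Each squared per-feature difference is scaled by its weight, so a
    feature with weight 0 is ignored and a larger weight counts for more.
    """
    if not (len(x) == len(y) == len(weights)):
        raise ValueError("x, y and weights must have the same length")
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(x, y, weights)))


def k_nearest(query, samples, weights, k):
    """Return the indices of the k samples closest to `query`.

    Closeness is measured with the weighted distance above; a smaller
    distance is read as a higher chance of sharing a category.
    """
    order = sorted(
        range(len(samples)),
        key=lambda i: weighted_euclidean(query, samples[i], weights),
    )
    return order[:k]


if __name__ == "__main__":
    # Illustrative data only; weights are assumed, not optimized.
    samples = [[10.0, 0.0], [1.0, 1.0], [0.0, 5.0]]
    print(k_nearest([0.0, 0.0], samples, weights=[1.0, 1.0], k=2))
```

Changing the weight vector changes which samples count as neighbors, which is why the abstract treats the choice of weights as an optimization problem.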
Classification: TP391 [Automation and Computer Technology: Computer Application Technology]