Authors: WAN Jing (万静)[1]; ZHENG Longjun (郑龙君); HE Yunbin (何云斌)[1]; LI Song (李松)[1] (School of Computer Science and Technology, Harbin University of Science and Technology, Harbin, Heilongjiang 150080, China)
Affiliation: [1] School of Computer Science and Technology, Harbin University of Science and Technology
Source: Journal of Computer Applications (《计算机应用》), 2019, No. 11, pp. 3280-3287 (8 pages)
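Funding: National Natural Science Foundation of China (61872105); Science and Technology Research Project of the Heilongjiang Education Department (1253lz004); Heilongjiang Province Science Foundation for Returned Overseas Scholars (LC2018030)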
Abstract: Reducing the impact of uncertain data on high-dimensional data clustering is a current research challenge. To address the low clustering accuracy caused by uncertain data and the curse of dimensionality, the uncertain data are first converted into certain data, and the certain data are then clustered. During this conversion, uncertain data are divided into value-uncertain data and dimension-uncertain data, and the two kinds are processed separately to improve the efficiency of the algorithm. A K-Nearest Neighbor (KNN) query combined with expected distance is used to obtain, for each uncertain datum, the approximate value that has the least impact on the clustering result, which improves clustering accuracy. Once the data are certain, subspace clustering is applied to avoid the curse of dimensionality. Experimental results show that the Clique-based high-dimensional uncertain data clustering algorithm (UClique) performs well on UCI datasets, with good noise resistance and scalability. It obtains good clustering results on high-dimensional data and high accuracy across different uncertain datasets, which indicates that the algorithm is reasonably robust and can cluster high-dimensional uncertain datasets effectively.
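The abstract does not spell out how the KNN query and the expected distance are combined, so the following is only a minimal sketch of one plausible reading, not the paper's UClique procedure. It assumes an uncertain object is given as a list of equally likely sample points, takes the expected distance to a certain point as the mean Euclidean distance over those samples, and replaces the uncertain object with the centroid of its k nearest certain points. The function names (`expected_distance`, `k_nearest_by_expected_distance`, `approximate_uncertain_object`) and the centroid rule are illustrative assumptions.

```python
import math


def expected_distance(samples, point):
    """Mean Euclidean distance from equally likely samples to a certain point.

    Assumption: the uncertain object is modelled by a finite list of
    equally probable sample points.
    """
    return sum(math.dist(s, point) for s in samples) / len(samples)


def k_nearest_by_expected_distance(samples, certain_points, k):
    """Indices of the k certain points closest to the uncertain object."""
    order = sorted(
        range(len(certain_points)),
        key=lambda i: expected_distance(samples, certain_points[i]),
    )
    return order[:k]


def approximate_uncertain_object(samples, certain_points, k):
    """Replace the uncertain object with the centroid of its k neighbors.

    Assumption: the centroid is a stand-in for the approximate value;
    the paper may choose it differently.
    """
    idx = k_nearest_by_expected_distance(samples, certain_points, k)
    dims = len(certain_points[0])
    return [sum(certain_points[i][d] for i in idx) / len(idx) for d in range(dims)]


if __name__ == "__main__":
    uncertain = [(0.0, 0.0), (2.0, 0.0)]            # two equally likely samples
    certain = [(1.0, 0.0), (10.0, 0.0), (1.0, 1.0)]  # certain data points
    print(approximate_uncertain_object(uncertain, certain, k=2))
```

After every uncertain object has been replaced this way, the resulting certain dataset could be passed to a subspace clustering step such as a Clique-style grid search, which is the second stage the abstract describes.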
Classification No.: TP311.13 [Automation and Computer Technology: Computer Software and Theory]