Authors: XIONG Jun-Zhu (熊君竹); HE Zhen-Feng (何振峰) [1]
Affiliation: [1] College of Computer and Data Science, Fuzhou University, Fuzhou 350108, China
Source: Computer Systems & Applications (《计算机系统应用》), 2022, No. 6, pp. 175-181 (7 pages)
Funding: Natural Science Foundation of Fujian Province (2018J01794).
Abstract: Clustering algorithms such as K-means are widely used in many fields, but K-means cannot directly handle incomplete data sets. k_m-means is a clustering algorithm for incomplete data sets; it reduces the influence of incomplete data on the clustering process by adjusting how the partial distance is computed. However, the cluster centroids selected in the initialization stage of k_m-means are rather unreliable, so the algorithm easily falls into a local optimum. To address this problem, this paper introduces credibility and proposes a k_m-means clustering algorithm combined with credibility. Credibility is used to adjust the distance calculation, which increases the reliability of the centroids selected during initialization and improves clustering accuracy. Finally, the effectiveness of the algorithm is verified on UCI and UCR data sets.
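Keywords: incomplete data; cluster centroid; credibility; partial distance; K-means
Classification: TP311.13 [Automation and Computer Technology: Computer Software and Theory]; TP181 [Automation and Computer Technology: Computer Science and Technology]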
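The abstract refers to partial distance without giving a formula. As a rough illustration only, the sketch below uses one common formulation from the incomplete-data clustering literature: compare only the observed features and rescale by the ratio of total to observed features. This is an assumption, not necessarily the exact distance used by k_m-means, and the credibility adjustment proposed in the paper is not shown because the record does not specify it. The function names (`partial_distance`, `assign`, `update`) and the use of NaN to mark missing values are hypothetical choices for this example.

```python
import math


def partial_distance(x, c):
    """Squared partial distance between sample x and centroid c.

    Assumption: missing features in x are encoded as NaN. Only observed
    features contribute, and the sum is rescaled by d / (number observed)
    so that samples with different amounts of missing data are comparable.
    """
    observed = [(xi, ci) for xi, ci in zip(x, c) if not math.isnan(xi)]
    if not observed:
        raise ValueError("sample has no observed features")
    total = sum((xi - ci) ** 2 for xi, ci in observed)
    return total * len(x) / len(observed)


def assign(samples, centroids):
    """Assign each sample to the centroid with the smallest partial distance."""
    return [
        min(range(len(centroids)),
            key=lambda j: partial_distance(x, centroids[j]))
        for x in samples
    ]


def update(samples, labels, centroids):
    """Recompute each centroid feature as the mean of observed values only.

    A feature with no observed value in a cluster keeps its old value.
    """
    new_centroids = []
    for j, old in enumerate(centroids):
        members = [x for x, lab in zip(samples, labels) if lab == j]
        row = []
        for f in range(len(old)):
            vals = [x[f] for x in members if not math.isnan(x[f])]
            row.append(sum(vals) / len(vals) if vals else old[f])
        new_centroids.append(row)
    return new_centroids


# Toy data: two of the four samples have one missing feature.
samples = [[1.0, math.nan], [0.0, 0.0], [10.0, 10.0], [math.nan, 9.0]]
centroids = [[0.0, 0.0], [10.0, 10.0]]
labels = assign(samples, centroids)
centroids = update(samples, labels, centroids)
```

In this sketch the initial centroids are fixed by hand. The problem the abstract describes arises when initial centroids are chosen from the data itself, since a candidate with many missing features yields distances based on very little evidence.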