Authors: Ji Duo (季铎)[1], Liu Yunzhao (刘云钊), Peng Ruxiang (彭如香)[2], Kong Huafeng (孔华锋)
Affiliations: [1] Criminal Investigation Police University of China, Shenyang 110854, Liaoning, China; [2] The Third Research Institute of Ministry of Public Security, Shanghai 201204, China; [3] Wuhan Business University, Wuhan 430056, Hubei, China
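Source: Computer Applications and Software (《计算机应用与软件》), 2024, No. 10, pp. 282-286, 318 (6 pages)
Funding: National Key R&D Program of China (2018YFC0830401); Open Project of the Liaoning Collaborative Innovation Center for Network Security Law Enforcement.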
Abstract: K-means remains one of the most popular clustering algorithms because of its low time complexity and fast running speed. However, it requires the number of clusters and the initial cluster centers to be specified in advance, and how well they are chosen directly affects the final clustering result. This paper studies the selection of both the initial cluster centers and the iterative cluster centers: the initial centers are chosen from a decision graph, and the topic-word vector of each cluster replaces the mean as the iterative cluster center. Experiments show that the proposed method selects the initial points accurately, and that using topic-word vectors as iterative cluster centers effectively avoids the influence of noise points and noise features, substantially improving the performance of K-means.
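Keywords: K-means; initial points; decision graph; iterative cluster center; topic-word vector
Classification number: TP319 [Automation and Computer Technology: Computer Software and Theory]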
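
The abstract names two ideas without giving details: choosing initial centers from a decision graph, and replacing the cluster mean with a topic-word vector. The sketch below is one possible reading, not the authors' implementation. It assumes the decision graph is of the density-peaks kind (local density `rho` plotted against `delta`, the distance to the nearest denser point), and it assumes a topic-word vector can be approximated by keeping only the `m` highest-weight features of the cluster mean. The function names and the parameters `dc` (density kernel width) and `m` are hypothetical.

```python
import numpy as np


def decision_graph_centers(X, k, dc):
    """Return indices of k initial centers (assumed density-peaks reading)."""
    # Pairwise Euclidean distances between all points.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Local density rho with a Gaussian kernel; subtract the self term exp(0) = 1.
    rho = np.exp(-(d / dc) ** 2).sum(axis=1) - 1.0
    delta = np.empty(len(X))
    for i in range(len(X)):
        denser = np.where(rho > rho[i])[0]
        # delta: distance to the nearest denser point;
        # the globally densest point gets its largest distance instead.
        delta[i] = d[i, denser].min() if denser.size else d[i].max()
    # Points with high rho AND high delta stand out on the decision graph.
    gamma = rho * delta
    return np.argsort(gamma)[::-1][:k]


def topic_vector(members, m):
    """Assumed stand-in for a topic-word vector: keep the m largest mean weights."""
    mean = members.mean(axis=0)
    top = np.argsort(mean)[::-1][:m]
    center = np.zeros_like(mean)
    center[top] = mean[top]  # all other (possibly noisy) features are zeroed
    return center


def kmeans_topic(X, k, dc, m, iters=20):
    """K-means loop using the two assumed components above."""
    centers = X[decision_graph_centers(X, k, dc)].astype(float)
    for _ in range(iters):
        # Assign every point to its nearest center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                new_centers[j] = topic_vector(members, m)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers
```

With document vectors such as TF-IDF rows in `X`, a call like `kmeans_topic(X, k=2, dc=1.0, m=2)` returns the cluster labels and the truncated centers. Zeroing low-weight features is only one plausible way to keep noise features out of the centers; the paper may construct its topic-word vectors differently.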