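Affiliation:[1] School of Computer Science and Technology, Nantong University, Nantong 226019
Source:Science Technology and Engineering (《科学技术与工程》), 2013, No. 34, pp. 10359-10363, 10368 (6 pages)
Funding:Supported by the Nantong University Natural Science Research Project (12Z037) and the Nantong University Natural Science Research Fund, Special Project on Transportation (10ZJ002)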
Abstract:The K-means clustering algorithm is an established method for cluster analysis in data mining. To improve its use in data preprocessing and in neural network structures, this paper studies the algorithm's shortcomings. The work covers the following points. First, the idea and main flow of the K-means algorithm are analyzed. Second, the algorithm is found to have the advantages of being simple and fast and of producing dense clusters that are clearly separated from one another. Third, it is found to have the following drawbacks: it is poorly suited to data with symbolic attributes; the value of k (the number of clusters to generate) must be given in advance; it is sensitive to "noisy data" and isolated points; and the adjusted cluster centers must be recomputed repeatedly. The experimental verification reports that choosing different initial values has little effect on the clustering result; that when clustering a data set takes many iterations, changing the input order of the data is worth trying; and that changing the input order of the data set directly affects the clustering result. The authors present these results as a useful reference for improving the efficiency of the K-means algorithm and as relevant to the improvement of data mining techniques.
Classification code:TP391.3 [Automation and Computer Technology: Computer Application Technology]
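
For readers unfamiliar with the procedure the abstract analyzes, the following is a minimal sketch of the standard K-means iteration, not the authors' implementation. The function name `kmeans`, its parameters, and the toy data are illustrative assumptions. It shows the points the abstract raises: k is fixed in advance by the number of initial centers, and the centers are recomputed on every pass until they stop changing.

```python
import math


def kmeans(points, initial_centers, max_iter=100):
    """Standard K-means on tuples of floats (illustrative sketch only)."""
    centers = [tuple(c) for c in initial_centers]
    k = len(centers)  # k must be chosen before clustering starts
    labels = [0] * len(points)
    iteration = 0
    for iteration in range(1, max_iter + 1):
        # Assignment step: attach each point to its nearest center.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Update step: recompute each center as the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centers.append(mean)
            else:
                new_centers.append(centers[j])  # empty cluster keeps its center
        if new_centers == centers:  # centers unchanged: converged
            break
        centers = new_centers
    return centers, labels, iteration


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, labels, n_iter = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
```

To probe the abstract's remarks on initial values and input order, one could call `kmeans` again with different `initial_centers` or with `points` shuffled and compare the returned labels. In this sketch the input order matters only where it determines the initial centers or breaks distance ties.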