Authors: SHI Hong-yan [1], XU Ming-ming (School of Science, Shenyang University of Technology, Shenyang 110870, China)
Affiliation: [1] School of Science, Shenyang University of Technology
Source: Journal of Shenyang University of Technology (《沈阳工业大学学报》), 2019, No. 5, pp. 555-559 (5 pages)
Funding: National Natural Science Foundation of China (61074005)
Abstract: The k-prototypes clustering algorithm selects its initial cluster centers at random, which makes its results unstable, and most existing clustering algorithms for mixed-attribute data yield low clustering quality. To address both problems, an improved k-prototypes algorithm based on the average difference degree is proposed. The initial cluster centers are selected using the average difference degree, which removes the randomness of the initial selection. In addition, the weights of the numerical attributes are determined by information entropy, the dissimilarity measure for categorical attributes is improved, and a dissimilarity measure for mixed-attribute data is given. The results show that the improved algorithm achieves relatively high accuracy and can effectively process mixed-attribute data.
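The abstract does not give the paper's formulas, so the following is only an illustrative sketch of the ingredients it names, not the authors' method. It assumes the textbook entropy weight method for the numerical attributes, the classic k-prototypes dissimilarity (weighted squared difference plus `gamma` times the number of categorical mismatches), and a plain mean of pairwise dissimilarities as a stand-in for the "average difference degree". All function names and the `gamma` parameter are hypothetical.

```python
import math


def entropy_weights(rows):
    """Entropy weight method (assumed variant) for numeric columns.

    rows: list of equal-length lists of floats, with at least 2 rows.
    Returns one weight per column; the weights sum to 1.
    """
    n, m = len(rows), len(rows[0])
    weights = []
    for j in range(m):
        col = [r[j] for r in rows]
        lo, hi = min(col), max(col)
        span = (hi - lo) or 1.0  # avoid division by zero on constant columns
        norm = [(v - lo) / span for v in col]  # min-max scaling to [0, 1]
        total = sum(norm)
        if total == 0:
            entropy = 1.0  # a constant column carries no information
        else:
            probs = [v / total for v in norm]
            entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        weights.append(1.0 - entropy)
    s = sum(weights)
    return [w / s for w in weights] if s > 0 else [1.0 / m] * m


def mixed_dissimilarity(x_num, x_cat, y_num, y_cat, weights, gamma=1.0):
    """Classic k-prototypes style dissimilarity with per-attribute numeric weights."""
    numeric = sum(w * (a - b) ** 2 for w, a, b in zip(weights, x_num, y_num))
    mismatches = sum(1 for a, b in zip(x_cat, y_cat) if a != b)
    return numeric + gamma * mismatches


def average_dissimilarity(i, nums, cats, weights, gamma=1.0):
    """Mean dissimilarity from object i to every other object (assumed definition)."""
    n = len(nums)
    total = sum(
        mixed_dissimilarity(nums[i], cats[i], nums[j], cats[j], weights, gamma)
        for j in range(n)
        if j != i
    )
    return total / (n - 1)


if __name__ == "__main__":
    nums = [[0.0, 5.0], [1.0, 5.0], [10.0, 5.0]]
    cats = [["a"], ["a"], ["b"]]
    w = entropy_weights(nums)
    scores = [average_dissimilarity(i, nums, cats, w) for i in range(len(nums))]
    # A deterministic initializer could, for example, start from the object
    # with the smallest average dissimilarity instead of a random one.
    print(w, scores, scores.index(min(scores)))
```

Because every quantity here is computed from the data alone, repeated runs pick the same starting object, which is the kind of stability the abstract attributes to replacing random initialization.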