Authors: Zhou Jing [1], Liu Jinsheng [1]
Affiliation: [1] College of Computer & Electronic Information, Guangdong University of Petrochemical Technology, Maoming, Guangdong 525000, China
Source: Application Research of Computers, 2016, No. 9, pp. 2634-2636, 2678 (4 pages)
Funding: National Natural Science Foundation of China (61473094)
Abstract: Random initialization adapts poorly to data objects with mixed (numeric and categorical) condition attributes, and its inherent arbitrariness can sharply degrade clustering quality. To address this, the paper proposes NCBT (numeric-classification and between the two), a new algorithm for locating initial cluster centers. It determines the position of the first cluster center element by comparing the entropy of the categorical-attribute objects with the Euclidean distance computed over the numeric-attribute objects. It then evaluates the distances and relationships among the mixed-attribute objects by iterative reasoning to obtain the next initial cluster center element, and repeats this for the remaining centers. Theoretical analysis and experiments indicate that the algorithm's average localization accuracy is at least 10 percentage points higher than that of random initialization, that it adapts well, and that it yields good clustering results.
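The abstract names two ingredients, entropy over categorical attributes and Euclidean distance over numeric attributes, but does not give the NCBT selection rule itself. The sketch below is therefore only an illustration of those ingredients, not the authors' algorithm: the function names, the additive mixed distance, and the farthest-first selection of centers are all assumptions made for this example.

```python
import math
from collections import Counter


def attribute_entropy(values):
    """Shannon entropy (in bits) of one categorical attribute column."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mixed_distance(a, b, numeric_idx, categorical_idx):
    """Assumed mixed distance: Euclidean on numeric attributes
    plus the number of mismatching categorical attributes."""
    num = math.sqrt(sum((a[i] - b[i]) ** 2 for i in numeric_idx))
    cat = sum(1 for i in categorical_idx if a[i] != b[i])
    return num + cat


def pick_initial_centers(data, k, numeric_idx, categorical_idx):
    """Farthest-first selection of k initial centers (a stand-in, NOT the NCBT rule).

    Returns the indices of the chosen objects in `data`.
    """
    def dist(i, j):
        return mixed_distance(data[i], data[j], numeric_idx, categorical_idx)

    indices = range(len(data))
    # First center: the object with the smallest total distance to all others.
    centers = [min(indices, key=lambda i: sum(dist(i, j) for j in indices))]
    # Each further center: the object farthest from its nearest chosen center.
    while len(centers) < k:
        candidates = [i for i in indices if i not in centers]
        centers.append(max(candidates, key=lambda i: min(dist(i, c) for c in centers)))
    return centers


# Hypothetical toy data: two numeric attributes and one categorical attribute.
data = [(1.0, 1.0, "a"), (1.2, 0.9, "a"), (5.0, 5.0, "b"), (5.1, 4.8, "b")]
centers = pick_initial_centers(data, k=2, numeric_idx=[0, 1], categorical_idx=[2])
print(centers, attribute_entropy([row[2] for row in data]))
```

On this toy data the two chosen centers fall in different groups, which is the behaviour a deterministic initialization aims for and random initialization cannot guarantee.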
Classification: TP301.6 [Automation and Computer Technology: Computer System Architecture]