A New Algorithm for Locating Initial Cluster Centers for Mixed-Attribute Objects

Published English title: New localization initialization of centers of cluster algorithm with mixed data


Authors: Zhou Jing[1], Liu Jinsheng[1] (College of Computer & Electronic Information, Guangdong University of Petrochemical Technology, Maoming, Guangdong 525000, China)

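Affiliation: [1] College of Computer & Electronic Information, Guangdong University of Petrochemical Technology, Maoming, Guangdong 525000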

Source: Application Research of Computers (《计算机应用研究》), 2016, No. 9, pp. 2634-2636, 2678 (4 pages)

Funding: National Natural Science Foundation of China (61473094)

Abstract: Random initialization adapts poorly to data objects with mixed conditional attributes, and its inherent arbitrariness can sharply degrade clustering quality. To address this, the paper proposes NCBT (numeric-classification and between the two), a new algorithm for locating initial cluster centers. NCBT determines the position of the first cluster-center element by comparing the entropy of the categorical conditional attributes with the Euclidean distances computed over the numeric conditional attributes. It then evaluates the distances and relationships between mixed-attribute objects by iterative reasoning to obtain the next initial cluster-center element, and repeats this step for the remaining centers. Theoretical analysis and experiments show that the algorithm's average localization accuracy is at least 10 percentage points higher than that of random initialization, and that it adapts well and produces good clustering results.
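The abstract does not give NCBT's formulas, so the following Python sketch is not the authors' algorithm. It only illustrates the ingredients the abstract names, under loudly stated assumptions: Shannon entropy for a categorical attribute, a mixed distance that adds a categorical mismatch count to the numeric Euclidean distance, and a farthest-first rule for choosing each further center. The function names, the unweighted sum in `mixed_distance`, and the "most central object" rule for the first center are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of a sequence of categorical values."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mixed_distance(a, b, num_idx, cat_idx):
    """Assumed mixed distance: Euclidean on numeric attributes
    plus the number of mismatching categorical attributes."""
    numeric = math.sqrt(sum((a[i] - b[i]) ** 2 for i in num_idx))
    categorical = sum(1 for i in cat_idx if a[i] != b[i])
    return numeric + categorical


def pick_initial_centers(data, k, num_idx, cat_idx):
    """Return indices of k initial centers (illustrative, not NCBT itself)."""
    # Assumed rule for the first center: the object with the smallest
    # total mixed distance to every object in the data set.
    first = min(
        range(len(data)),
        key=lambda i: sum(
            mixed_distance(data[i], x, num_idx, cat_idx) for x in data
        ),
    )
    centers = [first]
    while len(centers) < k:
        # Farthest-first step: take the object whose distance to its
        # nearest already-chosen center is largest.
        nxt = max(
            (i for i in range(len(data)) if i not in centers),
            key=lambda i: min(
                mixed_distance(data[i], data[c], num_idx, cat_idx)
                for c in centers
            ),
        )
        centers.append(nxt)
    return centers


# Toy data: two numeric attributes and one categorical attribute.
data = [
    (1.0, 1.0, "a"),
    (1.2, 0.9, "a"),
    (5.0, 5.0, "b"),
    (5.1, 4.8, "b"),
    (9.0, 1.0, "c"),
]
centers = pick_initial_centers(data, k=3, num_idx=(0, 1), cat_idx=(2,))
```

On this toy data the three selected objects fall in three different groups, which is the behaviour a deterministic initializer aims for and random initialization cannot guarantee. In practice the numeric attributes would need scaling before being summed with a mismatch count; that step is omitted here.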

Keywords: mixed conditional-attribute objects; distance; iteration; initial cluster centers

Classification number: TP301.6 [Automation and Computer Technology: Computer System Architecture]

 
