Attribute-Weighted Clustering Algorithm Based on Semi-Supervised K-Means  (Cited by: 6)


Authors: Pan Wei[1,2,3], Zhou Xiaoying[1,2,3], Wu Lifeng[1,2,3], Wang Guohui[1,2,3]

Affiliations: [1] College of Information Engineering, Capital Normal University, Beijing 100048, China; [2] Beijing Engineering Research Center of High Reliable Embedded System, Capital Normal University, Beijing 100048, China; [3] Beijing Key Laboratory of Electronic System Reliable Technology, Capital Normal University, Beijing 100048, China

Source: Computer Applications and Software (《计算机应用与软件》), 2017, No. 3, pp. 189-193, 242 (6 pages)

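Funding: National Natural Science Foundation of China (61202027); Project for Innovative Team Construction and Teacher Career Development in Beijing Municipal Universities (IDHT20150507)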

Abstract: K-Means is a classic unsupervised clustering algorithm, widely used across many fields because it is fast and stable. However, traditional K-Means does not account for the influence of irrelevant and noisy attributes, and it cannot determine the number of clusters K automatically. Existing improvements to K-Means also rarely address high-dimensional data or noise. This paper therefore combines PCA with K-Means and proposes a semi-supervised attribute-weighted clustering method. First, PCA is used to obtain fewer, more effective features, and the classification contribution rate of each feature (that is, the impact factor of each feature on clustering) is computed. Second, K is obtained by a semi-supervised adaptive algorithm. Finally, the weighted data set and K are applied to clustering. Experimental results show that the proposed algorithm achieves a better recognition rate and better universality.
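The three steps in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, since the abstract gives no formulas. Two assumptions are made here: each principal component's explained-variance ratio stands in for the "classification contribution rate", and K is simply the number of distinct labels in a small labeled subset, a placeholder for the paper's adaptive semi-supervised step. All function names are hypothetical.

```python
import numpy as np


def pca_with_contributions(X, n_components):
    """Return PCA scores and the explained-variance ratio of each kept component."""
    Xc = X - X.mean(axis=0)                       # center each attribute
    cov = np.cov(Xc, rowvar=False)                # attribute covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigh returns ascending eigenvalues
    eigvals = np.clip(eigvals, 0.0, None)         # guard against tiny negative round-off
    order = np.argsort(eigvals)[::-1][:n_components]
    ratios = eigvals[order] / eigvals.sum()       # ASSUMED stand-in for "contribution rate"
    return Xc @ eigvecs[:, order], ratios


def weighted_kmeans(Z, weights, k, n_iter=100, seed=0):
    """Lloyd's K-Means under the weighted distance sum_j w_j * (a_j - b_j)**2."""
    Zw = Z * np.sqrt(weights)                     # after rescaling, plain Euclidean = weighted
    rng = np.random.default_rng(seed)
    centers = Zw[rng.choice(len(Zw), size=k, replace=False)]
    for _ in range(n_iter):
        dist = ((Zw[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)              # assign each point to its nearest center
        new_centers = np.array([
            Zw[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):     # stop once the centers no longer move
            break
        centers = new_centers
    return labels


def cluster(X, known_labels, n_components=2):
    """Hypothetical pipeline: PCA, weights, K from labeled samples, weighted K-Means."""
    Z, ratios = pca_with_contributions(X, n_components)
    k = len(np.unique(known_labels))              # PLACEHOLDER for the adaptive semi-supervised step
    return weighted_kmeans(Z, ratios, k), ratios
```

Calling `cluster(X, known_labels, n_components=2)` on an `(n_samples, n_attributes)` array returns one cluster index per sample together with the weights that were applied. Scaling each component by the square root of its weight lets an unmodified K-Means loop minimize the weighted squared distance.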

Keywords: K-Means; clustering; semi-supervised; principal component analysis (PCA); attribute weighting

Classification code: TP301.6 [Automation and Computer Technology: Computer System Architecture]

 
