Semi-supervised clustering with weighted pairwise constraints projection  (Cited by: 2)


Authors: Pan Jun (潘俊)[1], Kong Fansheng (孔繁胜)[1], Wang Ruiqin (王瑞琴)[2]

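Affiliations: [1] College of Computer Science and Technology, Zhejiang University, Hangzhou 310027, Zhejiang, China; [2] College of Physics and Electronic Information Engineering, Wenzhou University, Wenzhou 325035, Zhejiang, China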

Source: Journal of Zhejiang University: Engineering Science (《浙江大学学报(工学版)》), 2011, No. 5, pp. 934-940 (7 pages)

Abstract: To fully exploit the information implied by pairwise constraints for guiding dimensionality reduction and clustering, a semi-supervised clustering method based on weighted pairwise constraints projection is proposed. The method builds the k-nearest-neighbor sets of the constrained instances and uses them to expand the original constraint set, analyzes the amount of information carried by each constrained pair to construct a weight coefficient matrix, and then derives a projection matrix under the guidance of the weighted pairwise constraints. The projection matrix maps the sample data into a low-dimensional space in which points of the same class lie close together and points of different classes are spread apart. In addition, the k-means clustering algorithm is modified with a new evaluation function so that clustering performance is optimized while violating as few pairwise constraints as possible. Experimental results on real-world datasets show that, compared with existing semi-supervised dimensionality-reduction clustering algorithms, the new method can cluster high-dimensional data at lower cost.
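
The abstract gives no formulas, so the following is only a minimal sketch of the general idea of a projection guided by weighted pairwise constraints, not the authors' algorithm. It assumes one common objective: maximize the weighted squared distances of cannot-link pairs minus those of must-link pairs after projection, which is solved by the leading eigenvectors of a weighted difference-scatter matrix. The function names `weighted_constraint_projection` and `count_violations`, the uniform default weights, and the toy data are all hypothetical; the paper's k-nearest-neighbor constraint expansion, its information-based weights, and its evaluation function are not reproduced here.

```python
import numpy as np

def weighted_constraint_projection(X, must_link, cannot_link, d,
                                   w_ml=None, w_cl=None):
    """Return an (n_features, d) projection matrix.

    Assumed objective (not taken from the paper): maximize the weighted
    squared distance of cannot-link pairs minus that of must-link pairs
    in the projected space, with orthonormal projection directions.
    """
    X = np.asarray(X, dtype=float)
    n_features = X.shape[1]
    # Uniform weights stand in for the paper's information-based weights.
    w_ml = np.ones(len(must_link)) if w_ml is None else np.asarray(w_ml, dtype=float)
    w_cl = np.ones(len(cannot_link)) if w_cl is None else np.asarray(w_cl, dtype=float)

    S = np.zeros((n_features, n_features))
    for (i, j), w in zip(cannot_link, w_cl):
        diff = X[i] - X[j]
        S += w * np.outer(diff, diff)   # push cannot-link pairs apart
    for (i, j), w in zip(must_link, w_ml):
        diff = X[i] - X[j]
        S -= w * np.outer(diff, diff)   # pull must-link pairs together

    # S is symmetric; eigh returns eigenvalues in ascending order,
    # so reverse the columns and keep the d leading eigenvectors.
    _, eigvecs = np.linalg.eigh(S)
    return eigvecs[:, ::-1][:, :d]

def count_violations(labels, must_link, cannot_link):
    """Count pairwise constraints that a cluster labelling violates."""
    bad_ml = sum(labels[i] != labels[j] for i, j in must_link)
    bad_cl = sum(labels[i] == labels[j] for i, j in cannot_link)
    return int(bad_ml + bad_cl)

# Toy data: the two classes differ along axis 0; axis 1 is nuisance variation.
X = np.array([[0.0, 0.0], [0.0, 5.0], [10.0, 0.0], [10.0, 5.0]])
must_link = [(0, 1), (2, 3)]
cannot_link = [(0, 2), (1, 3)]

W = weighted_constraint_projection(X, must_link, cannot_link, d=1)
Z = X @ W  # projected data, shape (4, 1)
```

In this toy case the projection keeps only the axis that separates the cannot-link pairs, so must-linked points coincide after projection. A constraint-aware k-means variant could then add a penalty based on something like `count_violations` to its assignment cost; how the paper actually defines that evaluation function is not stated in the abstract.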

Keywords: semi-supervised clustering; pairwise constraints; projection matrix; k-means algorithm

Classification code: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]

 
