Semi-Supervised Cluster Ensemble Model Based on Bayesian Network (基于贝叶斯网络的半监督聚类集成模型)  Cited by: 9


Authors: Wang Hongjun (王红军)[1,2], Li Zhishu (李志蜀)[2], Qi Jianhuai (戚建淮)[1], Cheng Yang (成飏)[2], Zhou Peng (周鹏)[2], Zhou Wei (周维)[2]

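Affiliations: [1] Institute of Informatization Research, Southwest Jiaotong University, Chengdu, Sichuan 610031; [2] College of Computer Science, Sichuan University, Chengdu, Sichuan 610054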

Source: Journal of Software (软件学报), 2010, No. 11, pp. 2814-2825 (12 pages)

Funding: National Natural Science Foundation of China, No. 61003142; projects funded by the Ministry of Railways of China, Nos. 2009X010-A and 2009X010-B

Abstract: Most existing cluster ensemble algorithms are unsupervised and therefore cannot exploit known information about a dataset, which degrades the accuracy, robustness, and stability of the ensemble. To overcome these shortcomings, this paper combines semi-supervised learning with cluster ensembles and designs a semi-supervised cluster ensemble (SCE) model. The main contributions are threefold. First, an SCE model based on a Bayesian network is proposed, and variational inference for the model is derived in detail. Second, building on this inference, a concrete algorithm within the EM (expectation maximization) framework is given step by step. Third, experiments are run on datasets selected from the UCI (University of California, Irvine) machine learning repository. The results show that both the SCE model and the EM algorithm derived from its variational inference can perform semi-supervised cluster ensemble and that, overall, they outperform NMFS (a nonnegative-matrix-factorization-based semi-supervised algorithm), semi-supervised SVM (support vector machine), and LVCE (latent variable model for cluster ensemble). The SCE model, first stated in this paper, combines the advantages of semi-supervised learning and cluster ensembles, and its final clustering results are better than those of semi-supervised clustering alone or cluster ensemble alone.

Keywords: semi-supervised cluster ensemble; variational inference; must-link; cannot-link

Classification code: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]
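
The must-link and cannot-link keywords refer to pairwise constraints: known information stating that two points belong in the same cluster or in different clusters. The sketch below illustrates how such constraints might be injected into a cluster ensemble. It is not the paper's Bayesian-network SCE model or its variational EM algorithm; it swaps in a much simpler co-association heuristic, and all function names, the threshold, and the toy data are assumptions made for illustration.

```python
from itertools import combinations

def co_association(base_labelings):
    """Fraction of base clusterings that put each pair of points together."""
    n, m = len(base_labelings[0]), len(base_labelings)
    co = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            agree = sum(1 for labels in base_labelings if labels[i] == labels[j])
            co[i][j] = agree / m
    return co

def apply_constraints(co, must_link=(), cannot_link=()):
    """Return a copy of co with pairwise constraints overriding the votes."""
    out = [row[:] for row in co]
    for i, j in must_link:
        out[i][j] = out[j][i] = 1.0
    for i, j in cannot_link:
        out[i][j] = out[j][i] = 0.0
    return out

def consensus(co, threshold=0.5):
    """Merge pairs whose co-association exceeds threshold (union-find).

    Caveat: merging is transitive, so a cannot-link pair can still end up
    together through intermediate points; this heuristic does not prevent it.
    """
    n = len(co)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if co[i][j] > threshold:
            parent[find(i)] = find(j)
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]

# Toy data (hypothetical): three base clusterings of six points.
base = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 2, 2],
    [1, 1, 1, 0, 0, 0],
]
co = co_association(base)
unsupervised = consensus(co)
constrained = consensus(
    apply_constraints(co, must_link=[(2, 3)], cannot_link=[(0, 2), (1, 2)])
)
print(unsupervised)  # [0, 0, 0, 1, 1, 1]
print(constrained)   # [0, 0, 1, 1, 1, 1]
```

In this toy run the constraints move point 2 from the first consensus cluster to the second, which is the kind of effect known information can have on an otherwise unsupervised ensemble.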

 
