Model selection for robust Bayesian mixture distributions  (Cited by: 1)


Authors: Qing Xiangyun (卿湘运)[1], Wang Xingyu (王行愚)[1]

Affiliation: [1] School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237

Source: Journal of Nanjing University (Natural Science), 2009, No. 5, pp. 689-698 (10 pages)

Funding: National Natural Science Foundation of China (60674089); Shanghai Leading Academic Discipline Project (B504)

Abstract: Bayesian approaches to robust mixture modelling based on Student-t distributions are less sensitive to outliers, which helps prevent over-estimation of the number of mixing components. However, previous methods for model selection under the variational Bayesian framework face two difficult problems: (1) The variational approach converges to a local maximum of the lower bound on the log-evidence, and that maximum depends on the initial parameter values. How can it be guaranteed that the initial settings for different models are consistent? (2) The lower bound is sensitive to the factorized approximation form used in the inference process. How can it be guaranteed that the approximation errors for different models are equivalent? In this paper, we present a model selection algorithm for robust Bayesian mixture distributions based on the deviance information criterion (DIC) proposed by Spiegelhalter et al. in 2002. Unlike the Bayesian Information Criterion (BIC), the DIC is straightforward to calculate and has been adopted in many modern applications. Inspired by the work of McGrory et al., who used DIC values for model selection in finite Gaussian mixture distributions and hidden Markov models, we derive the calculation of the DIC for the robust Bayesian mixture model under the variational approximation framework. The proposed algorithm learns model parameters and performs model selection simultaneously, which avoids having to choose the best model from a large set of candidate models according to a model selection criterion. A method for initializing the parameters of the algorithm is provided. Experimental results on simulated data and Old Faithful Geyser data containing a large number of outliers show good performance: the algorithm learns the parameters of the mixture components robustly and estimates the number of components fairly accurately.
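
For orientation, the DIC of Spiegelhalter et al. (2002) is usually written as DIC = D(theta_bar) + 2 p_D, where D(theta) = -2 log p(y | theta) is the deviance, theta_bar is the posterior mean, and p_D = E[D(theta)] - D(theta_bar) is the effective number of parameters. The sketch below illustrates only this generic definition on a toy model (a Gaussian mean with known variance and a flat prior), using draws from the exact posterior. It is not the variational calculation derived in the paper, and the function names `log_likelihood` and `dic` are hypothetical.

```python
import math
import random


def log_likelihood(data, mu, sigma=1.0):
    """Gaussian log-likelihood of data for mean mu and known std sigma."""
    n = len(data)
    squared_error = sum((x - mu) ** 2 for x in data)
    return (-0.5 * n * math.log(2.0 * math.pi * sigma ** 2)
            - squared_error / (2.0 * sigma ** 2))


def dic(data, posterior_draws, sigma=1.0):
    """Return (DIC, p_D) computed from posterior draws of mu."""
    # Deviance D(mu) = -2 log p(y | mu), evaluated at every draw.
    deviances = [-2.0 * log_likelihood(data, mu, sigma) for mu in posterior_draws]
    mean_deviance = sum(deviances) / len(deviances)
    # Deviance at the posterior mean of mu.
    mu_bar = sum(posterior_draws) / len(posterior_draws)
    deviance_at_mean = -2.0 * log_likelihood(data, mu_bar, sigma)
    # Effective number of parameters, then DIC = D(mu_bar) + 2 * p_D.
    p_d = mean_deviance - deviance_at_mean
    return deviance_at_mean + 2.0 * p_d, p_d


# Toy data (assumption: unit variance is known, only the mean is unknown).
rng = random.Random(0)
data = [rng.gauss(2.0, 1.0) for _ in range(50)]

# With a flat prior the posterior of mu is N(mean(data), 1 / n).
n = len(data)
sample_mean = sum(data) / n
draws = [rng.gauss(sample_mean, 1.0 / math.sqrt(n)) for _ in range(20000)]

dic_value, p_d = dic(data, draws)
print(f"DIC = {dic_value:.2f}, p_D = {p_d:.2f}")
```

Because this toy model has a single free parameter, p_D should come out close to 1. In a model selection setting, the candidate with the smaller DIC is preferred.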

Keywords: mixture model; variational learning; deviance information criterion; model selection

CLC number: TP393 [Automation and Computer Technology: Computer Application Technology]

 
