Restricted Bayesian Network Classifier Based on Gaussian Copula (基于高斯Copula的约束贝叶斯网络分类器研究)  Cited by: 10


Authors: Wang Shuangcheng (王双成)[1,2], Gao Rui (高瑞)[1], Du Ruijie (杜瑞杰)[1]

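Affiliations: [1] School of Mathematics and Information, Shanghai Lixin University of Commerce (上海立信会计学院), Shanghai 201620; [2] Lixin Accounting Research Institute, Shanghai Lixin University of Commerce, Shanghai 201620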

Source: Chinese Journal of Computers (《计算机学报》), 2016, No. 8, pp. 1612-1625 (14 pages)

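Funding: National Natural Science Foundation of China (61272209); Natural Science Foundation of Shanghai (15ZR1429700); Scientific Research Innovation Program of the Shanghai Municipal Education Commission (15ZZ099)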

Abstract: Classification problems with continuous attributes are ubiquitous. At present, two approaches are mainly used to handle continuous attributes in classifiers: one discretizes the continuous attributes, and the other estimates attribute densities with a Gaussian function or a Gaussian kernel function. Discretization can cause information loss, introduce noise, and make the class insufficiently sensitive to changes in attribute values. The Gaussian function and the Gaussian kernel function each have strengths and weaknesses in attribute density estimation, but they are strongly complementary. Drawing on Copula and Bayesian network theory, this paper builds a restricted Bayesian network classifier for continuous attributes by combining the Gaussian Copula density function, a Gaussian kernel function with a smoothing parameter, and greedy selection of attribute parent nodes using classification accuracy as the criterion. This avoids the problems caused by discretizing continuous attributes and lets the Gaussian function and the Gaussian kernel function complement each other in attribute density estimation. Experiments on real and simulated data show that the restricted Bayesian network classifier that estimates attribute densities with a Gaussian Copula combined with marginal Gaussian kernels achieves good classification accuracy.
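The density estimate named in the abstract, a Gaussian Copula combined with Gaussian kernel marginals, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes only two attributes per class, fits one bivariate copula per class instead of performing the paper's greedy parent selection, and takes the smoothing parameter `h` as a fixed input rather than tuning it. All function and class names (`kde_pdf`, `CopulaClassModel`, `classify`, and so on) are hypothetical, as is the toy data.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal: pdf, cdf (Phi) and inv_cdf (Phi^-1)


def kde_pdf(x, sample, h):
    """Gaussian kernel density estimate at x with smoothing parameter h."""
    return sum(_STD.pdf((x - s) / h) for s in sample) / (len(sample) * h)


def kde_cdf(x, sample, h):
    """CDF of the same kernel estimate, clipped away from 0 and 1."""
    u = sum(_STD.cdf((x - s) / h) for s in sample) / len(sample)
    return min(max(u, 1e-10), 1.0 - 1e-10)


def normal_scores(sample, h):
    """Map each training value x to z = Phi^-1(F_hat(x))."""
    return [_STD.inv_cdf(kde_cdf(x, sample, h)) for x in sample]


def correlation(a, b):
    """Pearson correlation of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(va * vb)


def log_gaussian_copula_density(z1, z2, rho):
    """Log density of the bivariate Gaussian copula at normal scores z1, z2."""
    one_minus = 1.0 - rho * rho
    quad = (rho * rho * (z1 * z1 + z2 * z2) - 2.0 * rho * z1 * z2) / (2.0 * one_minus)
    return -0.5 * math.log(one_minus) - quad


class CopulaClassModel:
    """Class-conditional density of two attributes for a single class."""

    def __init__(self, xs, ys, h):
        self.xs, self.ys, self.h = list(xs), list(ys), h
        rho = correlation(normal_scores(self.xs, h), normal_scores(self.ys, h))
        self.rho = max(min(rho, 0.999), -0.999)  # keep 1 - rho^2 positive

    def log_density(self, x, y):
        # Copula term on the normal scores plus the two kernel marginals.
        z1 = _STD.inv_cdf(kde_cdf(x, self.xs, self.h))
        z2 = _STD.inv_cdf(kde_cdf(y, self.ys, self.h))
        log_marginals = (math.log(kde_pdf(x, self.xs, self.h) + 1e-300)
                         + math.log(kde_pdf(y, self.ys, self.h) + 1e-300))
        return log_gaussian_copula_density(z1, z2, self.rho) + log_marginals


def classify(point, models, priors):
    """Return the class maximizing log prior + log class-conditional density."""
    return max(models, key=lambda c: math.log(priors[c]) + models[c].log_density(*point))


# Toy data (made up for illustration): two classes, two attributes each.
models = {
    "a": CopulaClassModel([0.0, 0.5, 1.0, 1.5, 2.0], [0.1, 0.4, 1.1, 1.4, 2.1], h=0.5),
    "b": CopulaClassModel([5.0, 5.5, 6.0, 6.5, 7.0], [2.0, 1.6, 1.0, 0.4, 0.1], h=0.5),
}
priors = {"a": 0.5, "b": 0.5}
print(classify((1.0, 1.0), models, priors))
```

In this sketch the kernel marginals capture the shape of each attribute's distribution, while the copula term carries only the dependence between attributes. This is one way to read the complementarity the abstract describes.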

Keywords: Bayesian network classifier; continuous attributes; Gaussian Copula; Gaussian kernel function; smoothing parameter; machine learning

Classification number: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]

 
