Research on a Relevant Attribute Reduction and Weighted Binary Classification Naive Bayesian Algorithm Based on Information Value

Cited by: 9


Authors: Yang Lihong (杨立洪), Li Qiongyang (李琼阳), Li Xingyao (李兴耀)

Affiliation: [1] School of Mathematics, South China University of Technology

Source: Statistics & Decision (《统计与决策》), 2018, No. 2, pp. 23-26 (4 pages)

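Funding: National Natural Science Foundation of China (11271140); Guangdong Province Industry-University-Research Collaborative Innovation Achievement Transformation Project (2016B090918041)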

Abstract: The classical naive Bayesian classification algorithm assumes that the attributes are mutually independent and that they influence the target variable to the same degree, an assumption that real problems almost never satisfy. Binary classification is the most common case in practice. Taking into account how attribute correlation, imbalanced sample distribution, and the unequal influence of individual attributes affect model performance in binary classification, this paper proposes a relevant attribute reduction and weighted binary classification naive Bayesian model based on information value. When assigning a sample to a class, the model uses self-adaptive learning to select an appropriate threshold, which weakens the influence of an imbalanced sample set. Empirical results show that introducing the information value to reduce and weight the relevant attributes greatly improves accuracy compared with the traditional naive Bayesian algorithm.
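The abstract names three ingredients: an information value per attribute, attribute weights derived from it, and a decision threshold tuned for imbalanced data. The sketch below illustrates each one under stated assumptions and is not the authors' implementation. It uses the standard information value definition for a categorical attribute, IV = Σ (p_c − q_c) · ln(p_c / q_c), where p_c and q_c are the shares of positive and negative samples in category c. The function names, the additive smoothing constant `eps`, the normalization of IVs into weights, and the use of balanced accuracy as the threshold criterion are all hypothetical choices made for illustration; the paper may define them differently.

```python
import math
from collections import Counter


def information_value(values, labels, eps=0.5):
    """Information value of one categorical attribute against 0/1 labels.

    eps is an assumed additive smoothing constant that keeps the log
    finite when a category contains only one class.
    """
    cats = sorted(set(values))
    pos = Counter(v for v, y in zip(values, labels) if y == 1)
    neg = Counter(v for v, y in zip(values, labels) if y == 0)
    n_pos, n_neg = sum(pos.values()), sum(neg.values())
    iv = 0.0
    for c in cats:
        p = (pos[c] + eps) / (n_pos + eps * len(cats))  # share of positives in c
        q = (neg[c] + eps) / (n_neg + eps * len(cats))  # share of negatives in c
        iv += (p - q) * math.log(p / q)                 # each term is >= 0
    return iv


def iv_weights(columns, labels):
    """Assumed weighting scheme: normalize IVs so the weights sum to 1."""
    ivs = {name: information_value(col, labels) for name, col in columns.items()}
    total = sum(ivs.values())
    if total == 0:  # no attribute carries information; fall back to equal weights
        return {name: 1.0 / len(ivs) for name in ivs}
    return {name: iv / total for name, iv in ivs.items()}


def best_threshold(scores, labels):
    """Assumed threshold search: maximize balanced accuracy over observed scores.

    A sample is predicted positive when its score is >= the threshold.
    Requires at least one positive and one negative label.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_t, best_bacc = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        bacc = 0.5 * (tp / n_pos + tn / n_neg)
        if bacc > best_bacc:
            best_t, best_bacc = t, bacc
    return best_t, best_bacc
```

In a weighted naive Bayes score, weights like these would typically multiply each attribute's log-likelihood ratio, and attributes with very low IV could be dropped as a simple form of reduction. The threshold search then replaces the default cut-off with one chosen from the data.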

Keywords: information value; attribute reduction; weighting; binary classification; Bayesian algorithm; self-adaptive

Classification code: TP391.7 [Automation and Computer Technology: Computer Application Technology]

 
