Stratified and Un-stratified Sampling in Bagging: Data Mining  


Author: Yousef M. T. El Gimati

Affiliation: Statistics Department, Faculty of Science, Benghazi University, Libya

Source: Journal of Mathematics and System Science (English edition), 2021, Issue 1, pp. 29-36 (8 pages)

Funding: We would like to acknowledge the Research and Consulting Centre (RCC), University of Benghazi, Libya, for funding this work.

Abstract: Stratified sampling is often used in opinion polls to reduce standard errors, and it is known in sampling theory as a variance reduction technique. The most common resampling method is the bootstrap, which samples the dataset with replacement. The main purpose of this work is to investigate extensions of resampling methods for classification problems. Specifically, we use decision trees with a family of stratification models to improve prediction accuracy by aggregating classifiers built on perturbed datasets. We use bagging as a method of estimating a good decision boundary under each of these stratification models. The overall conclusion is that, for decision trees, un-stratified bootstrapping with bagging can yield lower error rates than the other sampling strategies on simulated datasets. Based on the results of these experiments, a possible explanation for why un-stratified sampling performs best is that bagging is itself a method of stratification.
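
The two resampling schemes contrasted in the abstract can be illustrated with a minimal sketch. It assumes the strata are the class labels, which is only one possible choice; the abstract does not specify the family of stratification models, so the function names and the toy labels here are hypothetical and not the author's method. The un-stratified bootstrap draws n indices with replacement from the whole dataset, while the stratified version resamples within each class so that every replicate keeps the original class counts. A majority vote stands in for the aggregation step of bagging.

```python
import random
from collections import Counter, defaultdict


def unstratified_bootstrap(labels, rng):
    """Draw n indices with replacement from the whole dataset."""
    n = len(labels)
    return [rng.randrange(n) for _ in range(n)]


def stratified_bootstrap(labels, rng):
    """Draw indices with replacement within each stratum.

    Assumption: the stratum is the class label, so each replicate
    reproduces the original class counts exactly.
    """
    strata = defaultdict(list)
    for i, y in enumerate(labels):
        strata[y].append(i)
    sample = []
    for members in strata.values():
        sample.extend(rng.choice(members) for _ in range(len(members)))
    return sample


def majority_vote(predictions):
    """Aggregate the votes of the bagged classifiers for one observation."""
    return Counter(predictions).most_common(1)[0][0]


# Toy, imbalanced labels (illustrative data, not from the paper).
labels = ["a"] * 30 + ["b"] * 10
rng = random.Random(0)

plain = unstratified_bootstrap(labels, rng)
strat = stratified_bootstrap(labels, rng)

plain_counts = Counter(labels[i] for i in plain)
strat_counts = Counter(labels[i] for i in strat)

print("unstratified class counts:", dict(plain_counts))
print("stratified class counts:  ", dict(strat_counts))
```

In a full bagging run, each replicate would be used to fit one decision tree and `majority_vote` would combine the trees' predictions for each test observation. The class counts of the un-stratified replicates vary from draw to draw, whereas the stratified ones are fixed by construction.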

Keywords: bootstrapping; decision boundary; stratification models; resampling; classifier

Classification code: TP1 [Automation and Computer Technology: Control Theory and Control Engineering]

 
