Model-Free Feature Screening Based on Gini Impurity for Ultrahigh-Dimensional Multiclass Classification  


Authors: Zhongzheng Wang; Guangming Deng

Affiliations: [1] College of Science, Guilin University of Technology, Guilin, China; [2] Applied Statistics Institute, Guilin University of Technology, Guilin, China

Source: Open Journal of Statistics, 2022, Issue 5, pp. 711-732 (22 pages)

Abstract: Data sets often contain both categorical and continuous covariates, yet most feature screening methods for ultrahigh-dimensional classification assume that the covariates are continuous, and few screening methods apply to the mixed case. To handle this non-trivial situation, we propose a model-free feature screening procedure for ultrahigh-dimensional multiclass classification with both categorical and continuous covariates. The proposed method uses Gini impurity to evaluate the predictive power of each covariate. Under certain regularity conditions, we prove that the procedure possesses the sure screening property and the ranking consistency property. We demonstrate its finite-sample performance through simulation studies and illustrate it with a real data analysis.
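As a rough illustration of the idea of ranking covariates by Gini impurity, the sketch below scores each covariate by how much it reduces the impurity of the class label and keeps the top `d`. It is a generic impurity-reduction utility, not the statistic defined in the paper. The function names, the quantile slicing of continuous covariates, and the `n_bins` and `d` parameters are all assumptions made for this example.

```python
import numpy as np


def gini_impurity(y):
    """Gini impurity 1 - sum_k p_k^2 of a vector of class labels."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)


def gini_reduction(x, y, n_bins=4, categorical=False):
    """Impurity of y minus the weighted impurity of y within groups of x.

    Assumption: a continuous covariate is sliced at its empirical
    quantiles; a categorical covariate is grouped by its own levels.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    if categorical:
        groups = x
    else:
        # Interior quantile cut points, e.g. quartiles when n_bins=4.
        edges = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])
        groups = np.digitize(x, edges)
    within = 0.0
    for g in np.unique(groups):
        mask = groups == g
        within += mask.mean() * gini_impurity(y[mask])
    return gini_impurity(y) - within


def screen(X, y, d, categorical_cols=()):
    """Return the indices of the d highest-scoring columns and all scores."""
    scores = np.array([
        gini_reduction(X[:, j], y, categorical=(j in categorical_cols))
        for j in range(X.shape[1])
    ])
    return np.argsort(-scores)[:d], scores


# Toy usage: one informative covariate among 50, three classes.
rng = np.random.default_rng(0)
y_demo = rng.integers(0, 3, size=300)
X_demo = rng.normal(size=(300, 50))
X_demo[:, 0] = y_demo + 0.3 * rng.normal(size=300)
selected, demo_scores = screen(X_demo, y_demo, d=5)
```

Because the score uses only class frequencies within groups, it needs no model linking the covariate to the response, which is the sense in which such a utility is model-free.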

Keywords: Ultrahigh-Dimensional; Feature Screening; Model-Free; Gini Impurity; Multiclass Classification

Classification code: O17 [Science: Mathematics]

 
