Ultrahigh Dimensional Feature Screening Based on Conditional Distribution  Cited by: 1


Authors: LAI Peng (来鹏); SHEN Bao-hua (沈宝华); SONG Feng-li (宋凤丽) [1] (School of Mathematics & Statistics, Nanjing University of Information Science & Technology, Nanjing 210044, China)

Affiliation: [1] School of Mathematics & Statistics, Nanjing University of Information Science & Technology

Source: Mathematics in Practice and Theory (《数学的实践与认识》), 2018, No. 9, pp. 154-162 (9 pages)

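Funding: National Natural Science Foundation of China (11771215); Major Program of the National Social Science Foundation of China (16ZDA047, 17ZDA092); Natural Science Foundation of Jiangsu Province (BK20161530, BK20140983); Jiangsu Province "Qing Lan Project" (2016)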

Abstract: Feature screening is a fast and effective dimension-reduction method for ultrahigh-dimensional data. For ultrahigh-dimensional discriminant classification data, this paper proposes an improved feature screening method. The procedure requires no specification of the model structure, handles multi-class response variables, applies to both categorical and continuous covariates, and remains robust when the covariates follow heavy-tailed distributions. The paper proves theoretically that the proposed method satisfies the sure screening property and the ranking consistency property. Numerical simulations and a real-data application evaluate its finite-sample performance and confirm its effectiveness.

Keywords: feature screening; conditional distribution function; sure screening property; ranking consistency

Classification code: O212 [Science: Probability Theory and Mathematical Statistics]

 
