Tournament screening cum EBIC for feature selection with high-dimensional feature spaces  


Authors: CHEN ZeHua, CHEN JiaHua

Affiliations: [1] Department of Statistics & Applied Probability, National University of Singapore, 3 Science Drive 2, 117543, Singapore; [2] Department of Statistics, University of British Columbia, Vancouver, BC, V6T 1Z2, Canada

Source: Science China Mathematics (中国科学:数学(英文版)), 2009, Issue 6, pp. 1327-1341, 15 pages

Funding: supported by the Singapore Ministry of Education's ACRF Tier 1 (Grant No. R-155-000-065-112); supported by the Natural Sciences and Engineering Research Council of Canada and MITACS, Canada

Abstract: Feature selection characterized by a relatively small sample size and an extremely high-dimensional feature space is common in many areas of contemporary statistics. The high dimensionality of the feature space causes serious difficulties: (i) the sample correlations between features become high even if the features are stochastically independent; (ii) the computation becomes intractable. These difficulties make conventional approaches either inapplicable or inefficient. Reducing the dimensionality of the feature space and then applying low-dimensional approaches appears to be the only feasible way to tackle the problem. Along this line, we develop in this article a tournament screening cum EBIC approach for feature selection with a high-dimensional feature space. The procedure of tournament screening mimics that of a tournament. It is shown theoretically that tournament screening has the sure screening property, a necessary property which should be satisfied by any valid screening procedure. It is demonstrated by numerical studies that the tournament screening cum EBIC approach enjoys desirable properties such as having a higher positive selection rate and a lower false discovery rate than other approaches.
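The two-stage idea in the abstract (screen the features down in tournament-style rounds, then choose a final model with EBIC) can be illustrated with a rough Python sketch. This is not the authors' procedure: the abstract does not give the algorithmic details, so the group size, the number of winners per group, the stopping size, and the within-group ranking by absolute marginal correlation (swapped in here for a penalized-likelihood fit) are all assumptions made for illustration. The function names `ebic_gaussian`, `tournament_screen`, and `forward_select_ebic` are hypothetical. The EBIC used is the commonly cited Gaussian linear-model form, n log(RSS/n) + k log n + 2γ log C(P, k), and the final greedy forward search is likewise only one simple way to apply it.

```python
import math
import numpy as np


def ebic_gaussian(y, X, subset, gamma=1.0):
    """EBIC (up to an additive constant) of a Gaussian linear model on `subset`."""
    n, p_total = X.shape
    k = len(subset)
    # Least-squares fit with an intercept plus the chosen columns.
    design = np.column_stack([np.ones(n), X[:, subset]])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ coef) ** 2))
    # Goodness of fit + BIC penalty + extra term for the size of the model space.
    return (n * math.log(rss / n) + k * math.log(n)
            + 2.0 * gamma * math.log(math.comb(p_total, k)))


def tournament_screen(y, X, group_size=20, keep_per_group=5, target=30, seed=0):
    """Shrink the feature set in rounds; each random group sends its best few onward.

    Assumption: features are ranked within a group by absolute marginal
    correlation with y, a stand-in for a within-group penalized-likelihood fit.
    """
    if not (keep_per_group < group_size and keep_per_group <= target):
        raise ValueError("need keep_per_group < group_size and keep_per_group <= target")
    rng = np.random.default_rng(seed)
    survivors = np.arange(X.shape[1])
    yc = y - y.mean()
    while len(survivors) > target:
        rng.shuffle(survivors)  # random draw of groups for this round
        winners = []
        for start in range(0, len(survivors), group_size):
            group = survivors[start:start + group_size]
            Xc = X[:, group] - X[:, group].mean(axis=0)
            score = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) + 1e-12)
            top = group[np.argsort(score)[::-1][:keep_per_group]]
            winners.extend(top.tolist())
        survivors = np.array(winners)
    return np.sort(survivors)


def forward_select_ebic(y, X, candidates, gamma=1.0, max_size=10):
    """Greedy forward selection over the screened features, stopped by EBIC."""
    selected = []
    best = ebic_gaussian(y, X, selected, gamma)
    while len(selected) < max_size:
        trials = [(ebic_gaussian(y, X, selected + [j], gamma), j)
                  for j in map(int, candidates) if j not in selected]
        if not trials:
            break
        score, j = min(trials)
        if score >= best:  # no candidate lowers EBIC any further
            break
        selected.append(j)
        best = score
    return sorted(selected)


if __name__ == "__main__":
    # Synthetic small-n-large-P example: 100 samples, 200 features, 3 relevant.
    rng = np.random.default_rng(1)
    X = rng.standard_normal((100, 200))
    y = X[:, [3, 50, 120]] @ np.array([5.0, 5.0, 5.0]) + 0.5 * rng.standard_normal(100)
    kept = tournament_screen(y, X)
    print("features surviving screening:", kept.tolist())
    print("features selected by EBIC:", forward_select_ebic(y, X, kept))
```

In this sketch the screening loop can finish with fewer than `target` features, because each round keeps a fixed number of winners per group; a procedure that must return exactly a prescribed number of features would need a different final round.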

Keywords: extended Bayes information criterion; feature selection; penalized likelihood; reduction of dimensionality; small-n-large-P; sure screening. MSC: 62F07, 62P10

Classification number: O212 [Science: Probability Theory and Mathematical Statistics]
