Application of penalized logistic regression methods to the variable selection of SNPs data  (Cited by: 4)


Authors: 刘匆提, 李昂[1], 门志红, 姜博[1], 肖纯[1], 刘艳[1], 李贞子[1]

Affiliation: [1] Department of Health Statistics, Harbin Medical University, Harbin, Heilongjiang 150081, China

Source: Practical Preventive Medicine (《实用预防医学》), 2016, No. 11, pp. 1395-1399 (5 pages)

Funding: National Natural Science Foundation of China (81172741; 81302511)

Abstract: Objective: To compare the variable selection ability of three penalized logistic regression methods (L1 regularization, L2 regularization and elastic net) on single nucleotide polymorphism (SNP) data. Methods: Simulated SNP data were generated under different conditions according to preset parameters, and the variable selection ability of the three methods was evaluated on three measures: accuracy rate, error rate and correct index. Results: Accuracy rate ranked L2 regularization > elastic net > L1 regularization; error rate ranked L2 regularization > elastic net > L1 regularization; correct index ranked elastic net > L1 regularization > L2 regularization. Conclusions: Overall, elastic net showed the best selection ability. By combining the ideas of L1 and L2 regularization, it keeps the model sparse in high-dimensional data analysis, which makes the results easier to interpret, and it also solves the problem that correlated independent variables cannot enter the model simultaneously.
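The abstract does not give formulas for the penalties or for the three evaluation measures, so the following Python sketch only illustrates one plausible reading. It assumes the glmnet-style elastic net parameterization (alpha = 1 gives the L1 penalty, alpha = 0 the L2 penalty), and it assumes that accuracy rate is the share of truly associated SNPs that were selected, error rate is the share of unassociated SNPs that were selected, and correct index is their difference. The function names and the toy SNP indices are hypothetical and are not taken from the paper.

```python
def elastic_net_penalty(beta, lam, alpha):
    """Assumed glmnet-style penalty:
    lam * (alpha * ||beta||_1 + (1 - alpha) / 2 * ||beta||_2^2).
    alpha = 1 is pure L1 (lasso), alpha = 0 is pure L2 (ridge)."""
    l1 = sum(abs(b) for b in beta)
    l2_squared = sum(b * b for b in beta)
    return lam * (alpha * l1 + (1.0 - alpha) / 2.0 * l2_squared)


def selection_metrics(selected, causal, n_snps):
    """Return (accuracy_rate, error_rate, correct_index) under the
    ASSUMED definitions stated in the lead-in. Requires at least one
    causal SNP and at least one non-causal SNP."""
    selected, causal = set(selected), set(causal)
    n_null = n_snps - len(causal)
    accuracy_rate = len(selected & causal) / len(causal)  # true hits found
    error_rate = len(selected - causal) / n_null          # false hits made
    return accuracy_rate, error_rate, accuracy_rate - error_rate


# Toy setting (hypothetical): 10 SNPs, of which SNPs 0-3 are truly associated.
causal = {0, 1, 2, 3}
sparse_pick = {0, 1, 2, 7}    # a sparse selection with one false hit
dense_pick = set(range(10))   # a selection that keeps every SNP

sparse_result = selection_metrics(sparse_pick, causal, n_snps=10)
dense_result = selection_metrics(dense_pick, causal, n_snps=10)
print("sparse:", sparse_result)
print("dense: ", dense_result)
```

Under these assumed definitions, a selection that keeps every SNP reaches the highest accuracy rate and the highest error rate at once, so its correct index is zero. That is one way to read why a method can rank first on both rates and still rank last on the correct index.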

Keywords: penalized logistic regression; L1 regularization; L2 regularization; elastic net; correct index

Classification code: R195.1 [Medicine and Health: Health Statistics]

 
