Variable Selection for Interval-Censored Data with a Cured Subgroup

Authors: CAI Min; FANG Li-jun; LI Hong-xi; LI Shu-wei (School of Economics and Statistics, Guangzhou University, Guangzhou 510006, China)

Affiliation: [1] School of Economics and Statistics, Guangzhou University, Guangzhou, Guangdong 510006, China

Source: Journal of Applied Statistics and Management (《数理统计与管理》), 2023, No. 2, pp. 267-277 (11 pages)

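Funding: National Natural Science Foundation of China (11901128); National Statistical Science Research Project (2022LY041); Natural Science Foundation of Guangdong Province (2018A030310068, 2022A1515011901); Guangzhou Basic Research Program, City-University (Institute) Joint Funding Project (202102010512).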

Abstract: Interval-censored data with a cured subgroup arise frequently in medical studies with periodic follow-up or examination. In such studies, some subjects in the study population never experience the event of interest, and for each subject who does experience it, the event time is known only to fall within a time interval rather than being observed exactly. In practice, the number of covariates is often large, so it is important to identify the factors that substantially affect the occurrence of disease. This paper studies variable selection for interval-censored data with a cured subgroup. We adopt the minimum approximated information criterion method and develop a penalized expectation-maximization (EM) algorithm that performs variable selection and parameter estimation simultaneously. An important advantage of the proposed method is that no optimal tuning parameter needs to be selected during variable selection. In a simulation study, we compare the finite-sample performance of the proposed method with that of commonly used regularization methods such as LASSO, ALASSO, and SCAD. The results show that the proposed method achieves high variable selection accuracy and is computationally faster and more efficient than LASSO, ALASSO, and SCAD. Finally, we apply the proposed method to a set of interval-censored data on child mortality in Nigeria.
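The abstract does not give formulas, so the following is only a minimal sketch of the kind of objective that a minimum approximated information criterion (MIC) approach works with, under stated assumptions. It assumes the form commonly used in the MIC literature: the BIC-type penalty ln(n) times the number of nonzero coefficients is replaced by a smooth surrogate tanh(a·γ²), with the reparameterization β = γ·tanh(a·γ²). It also assumes a non-mixture cure model with population survival S(t|x) = exp(-exp(x'β)·F(t)) and, purely for illustration, an exponential baseline F(t) = 1 - exp(-rate·t). The function names, the scale `a`, and the baseline are hypothetical choices, and the paper's penalized EM algorithm is not reproduced here.

```python
import math

def l0_surrogate(gamma, a):
    # Smooth stand-in for the indicator 1{gamma != 0} (assumed tanh form).
    return math.tanh(a * gamma * gamma)

def mic_penalty(gammas, n, a):
    # BIC-type penalty: ln(n) times the approximate count of nonzero terms.
    return math.log(n) * sum(l0_surrogate(g, a) for g in gammas)

def pop_survival(t, x, beta, rate):
    # Non-mixture cure model: S(t|x) = exp(-theta * F(t)), theta = exp(x'beta).
    # F is an assumed exponential baseline; S(inf|x) = exp(-theta) is the cure rate.
    theta = math.exp(sum(xj * bj for xj, bj in zip(x, beta)))
    baseline_cdf = 1.0 - math.exp(-rate * t)
    return math.exp(-theta * baseline_cdf)

def interval_loglik(data, beta, rate):
    # data: list of (left, right, x); right = math.inf marks right censoring.
    total = 0.0
    for left, right, x in data:
        prob = pop_survival(left, x, beta, rate)       # P(T > left)
        if math.isfinite(right):
            prob -= pop_survival(right, x, beta, rate)  # P(left < T <= right)
        total += math.log(prob)
    return total

def mic_objective(data, gammas, rate, a):
    # Reparameterize beta_j = gamma_j * tanh(a * gamma_j^2), then penalize.
    beta = [g * l0_surrogate(g, a) for g in gammas]
    fit = -2.0 * interval_loglik(data, beta, rate)
    return fit + mic_penalty(gammas, len(data), a)

# Toy data: two interval-censored subjects and one right-censored subject.
toy = [(0.0, 1.0, [1.0, 0.0]), (1.0, 2.0, [0.5, 1.0]), (2.0, math.inf, [0.0, 1.0])]
value = mic_objective(toy, [0.3, 0.0], rate=0.5, a=25.0)
```

Because the penalty weight is fixed at ln(n), an objective of this form can be minimized once with a smooth optimizer instead of being refit over a grid of tuning parameter values, which is consistent with the advantage the abstract claims. Here `a` is a fixed smoothing constant in the sketch, not a value taken from the paper.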

Keywords: failure time; interval censoring; variable selection; BIC criterion; non-mixture cure rate model

Classification: O212 [Science: Probability Theory and Mathematical Statistics]

 
