Authors: Jing ZHANG (张婧) [1]; Yan Yan LIU (刘妍岩) [2]
Affiliations: [1] School of Statistics and Mathematics, Zhongnan University of Economics and Law, Wuhan 430073, P.R. China; [2] School of Mathematics and Statistics, Wuhan University, Wuhan 430072, P.R. China
Source: Acta Mathematica Sinica, Chinese Series (《数学学报(中文版)》), 2024, No. 3, pp. 582-598 (17 pages)
Funding: National Natural Science Foundation of China (11971362, 11901581, 12371274); Natural Science Foundation of Hubei Province (2021CFB502); Fundamental Research Funds for the Central Universities, Zhongnan University of Economics and Law (2722024BY024).
Abstract: Linear regression models are often used to study the relationship between variables in fields such as medicine, genetics, and economics, for both analysis and prediction. In many practical problems, however, main effects alone are not sufficient to characterize the relationship between the response and the predictors: interaction effects between variables can also have an important influence on the response. An interaction model that considers both main effects and interaction effects describes the relationship between variables more comprehensively. For high-dimensional data, the number of variables p is large and the number of second-order interaction terms, p(p+1)/2, is much larger, so the statistical analysis of the interaction model faces substantial difficulties. How to select, from a huge number of candidates, the interaction effects that have a significant impact on the event of interest is therefore an important problem. Existing research on this problem mainly focuses on complete data under the framework of the linear model. In this paper, we consider the problem for ultrahigh-dimensional right-censored survival data. Based on distance correlation and the two-step analysis method, we propose a model-free screening method for interaction effects. The method selects important main effects and important interaction effects at the same time, and can handle ultrahigh-dimensional data with large p. Extensive simulation studies are carried out to evaluate the finite-sample performance of the proposed procedure, and the results show that the method can effectively select important interaction effects for ultrahigh-dimensional right-censored survival data. As an illustration, we apply the proposed method to the diffuse large B-cell lymphoma (DLBCL) data.
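Keywords: interaction effects; ultrahigh-dimensional survival data; distance correlation; two-step analysis method; variable screening
Classification number: O212.1 [Science: Probability Theory and Mathematical Statistics]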
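As a rough illustration of the two quantities the abstract relies on, the sketch below counts the second-order terms p(p+1)/2 and ranks predictors by the classical sample distance correlation. It is not the authors' procedure: it assumes complete (uncensored) responses, performs only a single marginal screening step rather than the two-step analysis, and the function names `n_second_order_terms`, `distance_correlation`, and `screen_by_dcor` are hypothetical names chosen for this example.

```python
import numpy as np


def n_second_order_terms(p):
    """Number of second-order terms (squares and pairwise products): p(p+1)/2."""
    return p * (p + 1) // 2


def _centered_distances(x):
    """Double-centered matrix of pairwise Euclidean distances of a sample."""
    x = np.asarray(x, dtype=float)
    x = x.reshape(len(x), -1)  # treat a 1-D input as n observations of one variable
    d = np.sqrt(((x[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1))
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def distance_correlation(x, y):
    """Classical sample distance correlation (V-statistic form), in [0, 1]."""
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = (a * b).mean()          # squared sample distance covariance
    denom = np.sqrt((a * a).mean() * (b * b).mean())
    if denom == 0.0:                # one of the samples is constant
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def screen_by_dcor(X, y, keep):
    """Rank the columns of X by distance correlation with y; return the top `keep`.

    Assumption: y is fully observed. Right censoring, which the paper
    addresses, is NOT handled in this sketch.
    """
    X = np.asarray(X, dtype=float)
    scores = np.array([distance_correlation(X[:, j], y) for j in range(X.shape[1])])
    order = np.argsort(-scores)     # indices sorted by decreasing score
    return order[:keep], scores
```

For p = 1000 predictors, `n_second_order_terms(1000)` gives 500500 candidate second-order terms, which is why a cheap screening step is usually run before any model is fitted to the interactions.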