Authors: Yong Xin BAI (白永昕); Man Ling QIAN (钱曼玲); Mao Zai TIAN (田茂再) [3,4,5]
Affiliations: [1] School of Science, Beijing Information Science and Technology University, Beijing 100192, P.R. China; [2] School of Mathematics and Statistics, The University of Melbourne, Melbourne, VIC 3010, Australia; [3] Center for Applied Statistics and School of Statistics, Renmin University of China, Beijing 100872, P.R. China; [4] Center for Social and Economic Statistics and School of Statistics and Information, Xinjiang University of Finance and Economics, Urumqi 830012, P.R. China; [5] School of Medical Engineering and Technology, Xinjiang Medical University, Urumqi 830063, P.R. China
Source: Acta Mathematica Sinica, Chinese Series (《数学学报(中文版)》), 2024, No. 3, pp. 444-467 (24 pages)
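Funding: Research Funds of Renmin University of China (supported by the Fundamental Research Funds for the Central Universities), project "Dynamic Robust Modeling of Contemporary Complex Big Data and Its Applications" (22XNL016); Beijing Information Science and Technology University project "Variable Screening and Variable Selection for Quantile Regression Models with Complex Ultra-High Dimensional Data" (2022XJJ31)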
Abstract: We propose an effective iterative variable screening method for ultra-high dimensional additive quantile regression models with missing data. Specifically, the idea of canonical correlation analysis is incorporated into the maximum correlation coefficient based on optimal transformations: covariates are ranked by their marginal contributions, measured by the maximum correlation coefficient between the optimally transformed covariates and the optimally transformed model residuals, and are screened accordingly. Building on the screening step, a sparse-smooth penalty is then used for further variable selection. The proposed screening method has three advantages: (1) the maximum correlation based on optimal transformations reflects the nonlinear dependence of the response on the covariates more comprehensively; (2) using residuals during the iterations captures information about the model and thereby improves screening accuracy; (3) separating variable screening from model estimation avoids regressing on redundant covariates. Under appropriate conditions, we prove the sure independence screening property of the screening method, as well as the sparsity and consistency of the estimator under the sparse-smooth penalty. The performance of the proposed method is assessed by Monte Carlo simulation, and its effectiveness is illustrated on a set of rat genome data.
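Illustrative note: the ranking idea described in the abstract can be sketched in a few lines of Python. This is only a rough sketch under strong assumptions, not the authors' procedure. It approximates the maximum correlation after optimal transformation by the first canonical correlation between cubic polynomial expansions of a covariate and of the response (a stand-in for whatever basis the paper uses), and it performs a single marginal ranking on complete data. The residual-based iteration, the quantile loss, the missing-data handling, and the sparse-smooth penalty are not reproduced. The names `poly_basis`, `max_corr`, and `screen`, and the simulated data, are hypothetical.

```python
import numpy as np

def poly_basis(v, degree=3):
    """Centered polynomial expansion of a 1-D array (assumed stand-in for a spline basis)."""
    v = (v - v.mean()) / v.std()
    basis = np.column_stack([v ** d for d in range(1, degree + 1)])
    return basis - basis.mean(axis=0)

def max_corr(x, r, degree=3):
    """First canonical correlation between the expansions of x and r.

    The canonical correlations of two centered matrices are the singular
    values of Qx.T @ Qr, where Qx and Qr are orthonormal bases of their
    column spaces.
    """
    qx, _ = np.linalg.qr(poly_basis(x, degree))
    qr_, _ = np.linalg.qr(poly_basis(r, degree))
    s = np.linalg.svd(qx.T @ qr_, compute_uv=False)
    return float(min(s[0], 1.0))

def screen(X, r, keep):
    """Rank the columns of X by max_corr with r and keep the top `keep` indices."""
    scores = np.array([max_corr(X[:, j], r) for j in range(X.shape[1])])
    order = np.argsort(-scores)  # descending score
    return order[:keep], scores

# Simulated example (hypothetical): only columns 0 and 1 drive the response,
# both through nonlinear functions.
rng = np.random.default_rng(0)
n, p = 500, 50
X = rng.uniform(-1.0, 1.0, size=(n, p))
y = 2.0 * np.sin(np.pi * X[:, 0]) + 4.0 * X[:, 1] ** 2 + 0.1 * rng.standard_normal(n)

selected, scores = screen(X, y, keep=2)
print("selected columns:", sorted(selected.tolist()))
```

In an iterative variant along the lines the abstract describes, `r` would be replaced by the residuals of the current fitted model at each round rather than the raw response.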
Classification: O212.1 [Science / Probability Theory and Mathematical Statistics]