Authors: Bai Yongxin (白永昕); Qian Manling (钱曼玲); Tian Maozai (田茂再) [3,4,5]
Affiliations: [1] School of Science, Beijing Information Science and Technology University, Beijing 100192, China; [2] School of Mathematics and Statistics, The University of Melbourne, Melbourne 3010, Australia; [3] Center for Applied Statistics of Renmin University of China, Beijing 100872, China; [4] School of Statistics and Information, Xinjiang University of Finance and Economics, Urumqi 830012, China; [5] School of Mathematics and Data Science, Changji University, Changji Hunan 831100, China
Source: Statistics & Decision (《统计与决策》), 2024, Issue 9, pp. 43-48 (6 pages)
Funding: Beijing Natural Science Foundation (1242005); Beijing Information Science and Technology University Research Fund (2022XJJ31).
Abstract: In ultrahigh-dimensional data, on the one hand, the dimensionality of the covariates may be much larger than the sample size, and may even grow exponentially with it. On the other hand, ultrahigh-dimensional data are typically heterogeneous: the influence of the covariates on the center of the conditional distribution may differ greatly from their influence on the tails, and heavy tails and outliers may also be present. This paper investigates variable selection and robust estimation for partially linear additive quantile regression models when the covariate dimension diverges and is ultrahigh. First, to achieve model sparsity and nonparametric smoothness, a non-convex Atan double penalty is introduced, and the resulting optimization problem is solved with a quantile iterative coordinate descent algorithm. The theoretical properties of the proposed double-penalty estimator are proved under an appropriate choice of regularization parameters. Second, the performance of the proposed method is verified through simulation studies. The simulation results indicate that the proposed method outperforms other penalty methods, especially when the data are heavy-tailed. Finally, the practicality of the proposed method is verified through an empirical analysis of blood sample data from cancer screening patients.
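The abstract names two building blocks without giving formulas: the quantile (check) loss and the Atan penalty. The sketch below is illustrative only and is not the paper's estimator. It assumes the Atan penalty takes the form commonly stated in the literature, lam * (gamma + 2/pi) * arctan(|t| / gamma), and it covers only a linear part with a single penalty; the nonparametric additive component, the second (smoothness) penalty, and the coordinate descent solver are omitted. The names `check_loss`, `atan_penalty`, `penalized_objective`, `lam`, and `gamma` are hypothetical.

```python
import math


def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def atan_penalty(t, lam, gamma):
    """Assumed Atan penalty: lam * (gamma + 2/pi) * arctan(|t| / gamma).

    It is zero at t = 0 and levels off for large |t|, so large
    coefficients are penalized less aggressively than under an L1 penalty.
    """
    return lam * (gamma + 2.0 / math.pi) * math.atan(abs(t) / gamma)


def penalized_objective(y, X, beta, tau, lam, gamma):
    """Average check loss of a linear fit plus the Atan penalty on beta.

    Simplification: linear part only; this is not the double-penalty
    objective of the paper.
    """
    n = len(y)
    loss = 0.0
    for yi, xi in zip(y, X):
        fitted = sum(b * v for b, v in zip(beta, xi))
        loss += check_loss(yi - fitted, tau)
    penalty = sum(atan_penalty(b, lam, gamma) for b in beta)
    return loss / n + penalty
```

With `tau = 0.5` the check loss is half the absolute error, so the objective reduces to a penalized median regression criterion.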
Keywords: ultrahigh-dimensional data; quantile regression; partially linear additive model; variable selection; Atan double penalty
Classification number: O212.4 [Science: Probability Theory and Mathematical Statistics]