Authors: ZHANG Yunqi (张韵祺), ZHANG Chunming (张春明), TANG Niansheng (唐年胜) [1]
Affiliations: [1] Yunnan Key Laboratory of Statistical Modeling and Data Analysis, Yunnan University, Kunming 650091, China; [2] Department of Statistics, University of Wisconsin-Madison, Madison 53705, USA
Source: Acta Mathematicae Applicatae Sinica (《应用数学学报》), 2022, No. 1, pp. 31-46 (16 pages)
Funding: National Natural Science Foundation of China (11690014); U.S. National Science Foundation (DMS-1712418); Wisconsin Alumni Research Foundation.
Abstract: We introduce the Bregman divergence as a general loss function for the generalized linear sparse model with group structure, so that parameter estimation and variable selection are not limited to a specific model or a specific loss function. We compare the characteristics of eight penalty functions (Ridge, SCAD, Lasso, Adaptive Lasso, Group Lasso, Hierarchical Lasso, Adaptive Hierarchical Lasso, and Sparse Group Lasso) and the corresponding methods of parameter estimation and variable selection. We also give a coordinate descent algorithm for the Hierarchical Lasso and an accelerated full gradient update algorithm for the Sparse Group Lasso. The simulation study shows that Group Lasso, Hierarchical Lasso, Adaptive Hierarchical Lasso, and Sparse Group Lasso make better use of the group structure in the data; that Adaptive Hierarchical Lasso and Sparse Group Lasso outperform the other methods in variable selection accuracy and parameter estimation accuracy; and that Sparse Group Lasso achieves the best prediction accuracy. As an empirical example, we apply a logistic model with the Sparse Group Lasso penalty to gene expression levels in peripheral blood mononuclear cells of patients with osteoarthritis, and select 136 genes in 9 gene sets that are associated with osteoarthritis, in the hope of providing some guidance for follow-up biomedical research.
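The two central objects named in the abstract, the Bregman divergence and the Sparse Group Lasso penalty, can be illustrated with a minimal sketch. The code below uses the textbook definition of the Bregman divergence, D_phi(p, q) = phi(p) - phi(q) - <grad phi(q), p - q>, and one common parameterization of the Sparse Group Lasso penalty with group weights sqrt(group size) and a mixing parameter `alpha`. The function names, the weighting, and the parameters `lam` and `alpha` are assumptions made for illustration; the record does not state the exact form used in the paper.

```python
import numpy as np


def bregman_divergence(phi, grad_phi, p, q):
    """Bregman divergence D_phi(p, q) for a convex generator phi.

    phi      : callable returning a scalar
    grad_phi : callable returning the gradient of phi
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return phi(p) - phi(q) - np.dot(grad_phi(q), p - q)


def sparse_group_lasso_penalty(beta, groups, lam, alpha):
    """Assumed form: lam * (alpha * ||beta||_1
                            + (1 - alpha) * sum_g sqrt(|g|) * ||beta_g||_2).

    groups : list of index lists, one per group (hypothetical layout)
    """
    beta = np.asarray(beta, dtype=float)
    l1_term = np.sum(np.abs(beta))
    group_term = sum(
        np.sqrt(len(idx)) * np.linalg.norm(beta[idx]) for idx in groups
    )
    return lam * (alpha * l1_term + (1.0 - alpha) * group_term)


# Generator phi(x) = ||x||^2, for which the Bregman divergence
# reduces to the squared Euclidean distance ||p - q||^2.
def squared_norm(x):
    return float(np.dot(x, x))


def squared_norm_grad(x):
    return 2.0 * x


if __name__ == "__main__":
    d = bregman_divergence(squared_norm, squared_norm_grad, [3.0, 4.0], [1.0, 1.0])
    print("Bregman divergence (squared-norm generator):", d)

    pen = sparse_group_lasso_penalty(
        [3.0, 4.0, 0.0, 0.0], groups=[[0, 1], [2, 3]], lam=1.0, alpha=0.5
    )
    print("Sparse Group Lasso penalty:", pen)
```

With the squared-norm generator the divergence is the familiar quadratic loss; other generators yield other losses, which is the sense in which the Bregman divergence serves as a general loss. In the penalty, `alpha = 1` recovers a plain Lasso penalty and `alpha = 0` a Group Lasso penalty.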
Keywords: Lasso; Bregman divergence; group structure; generalized linear model; sparse model
Classification number: O212.1 [Science: Probability Theory and Mathematical Statistics]