Authors: Wei Liu, Guizhen Li, Ling Zhou, Lan Luo
Affiliations: [1] School of Mathematics, Sichuan University, Chengdu 610041, China; [2] Center of Statistical Research and School of Statistics, Southwestern University of Finance and Economics, Chengdu 611130, China; [3] Department of Statistics and Actuarial Science, University of Iowa, Iowa City, IA 52242, USA
Source: Science China Mathematics (中国科学(数学英文版)), 2025, Issue 4, pp. 969-1000 (32 pages)
Funding: Supported by the National Key R&D Program of China (Grant No. 2022YFA1003702) and the National Natural Science Foundation of China (Grant Nos. 11931014 and 12271441).
Abstract: Missingness in mixed-type variables is commonly encountered in a variety of areas. The requirement of complete observations necessitates data imputation when a moderate or large proportion of data is missing. However, inappropriate imputation degrades the performance of machine learning algorithms, leading to poor predictions and unreliable statistical inference. For high-dimensional, large-scale, mixed-type missing data, we develop a computationally efficient imputation method, missing value imputation via generalized factor models (MIG), under the missing-at-random assumption. The proposed MIG method allows missing variables to be of different types, including continuous, binary, and count variables, and is scalable in both the data size n and the variable dimension p, whereas existing imputation methods rely on restrictive assumptions such as a single type of missing variable, low dimensionality of the variables, and a limited sample size. We explicitly show that the imputation error of the proposed MIG method diminishes to zero at the rate $O_p(\max\{n^{-1/2}, p^{-1/2}\})$ as both n and p tend to infinity. Five real datasets demonstrate the superior empirical performance of the proposed MIG method over existing methods: the average normalized absolute imputation error is reduced by 5.3%–34.1%.
Keywords: imputation; high-dimensional mixed-type data; missing at random; generalized factor model