Comparative Study on Methods for Handling Missing Data in Binary Variables


Author: Yu Xueqin (余雪勤)

Affiliation: [1] School of Science, Chongqing University of Technology, Chongqing

Source: Statistics and Application (《统计学与应用》), 2023, No. 5, pp. 1376-1384 (9 pages)

Abstract: This paper reviews several commonly used imputation methods under the missing-at-random (MAR) mechanism, focusing on multiple imputation and regression imputation. It then compares their performance by simulating different missing rates for the response variable of a real data set. The results show that, at low missing rates, logistic-regression-based multiple imputation and regression imputation perform similarly, although under multiple imputation the coefficients and standard errors of a few individual model parameters, after either 1 or 5 imputations, differ considerably from those of the complete data. At high missing rates, logistic-regression-based multiple imputation is clearly less efficient than regression imputation; 1 and 5 imputations give similar results, and the coefficients and standard errors after imputation differ greatly from the complete-data values.
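The two approaches compared in the abstract can be illustrated with a small, self-contained simulation. The sketch below is not the paper's code or data: the single covariate, the sample size, the coefficients, the missingness model, and all function names are assumptions chosen for illustration. Regression imputation is taken here to mean filling each missing response with the predicted class from a logistic model fitted to the observed cases, and multiple imputation draws the missing responses at random from a logistic model with perturbed coefficients and pools the results with Rubin's rules.

```python
import math
import random

def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, iters=25):
    """Newton-Raphson fit of P(y=1|x) = sigmoid(b0 + b1*x).
    Returns (b0, b1) and the 2x2 inverse-information (covariance) matrix."""
    b0 = b1 = 0.0
    cov = None
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        cov = [[h11 / det, -h01 / det], [-h01 / det, h00 / det]]
        b0 += cov[0][0] * g0 + cov[0][1] * g1
        b1 += cov[1][0] * g0 + cov[1][1] * g1
    return (b0, b1), cov

def simulate(n, shift, rng, b0=-0.5, b1=1.0):
    """Binary y from a logistic model. y is set to None with a probability
    that depends only on the observed x (MAR); shift controls the rate."""
    xs = [rng.gauss(0, 1) for _ in range(n)]
    ys = [int(rng.random() < sigmoid(b0 + b1 * x)) for x in xs]
    ys = [None if rng.random() < sigmoid(shift + 0.5 * x) else y
          for x, y in zip(xs, ys)]
    return xs, ys

def fit_observed(xs, ys):
    """Fit the logistic model on the cases whose response is observed."""
    obs = [(x, y) for x, y in zip(xs, ys) if y is not None]
    return fit_logistic([x for x, y in obs], [y for x, y in obs])

def regression_impute(xs, ys):
    """Single deterministic imputation with the predicted class."""
    (b0, b1), _cov = fit_observed(xs, ys)
    return [y if y is not None else int(sigmoid(b0 + b1 * x) >= 0.5)
            for x, y in zip(xs, ys)]

def draw_beta(beta, cov, rng):
    """Draw coefficients from N(beta, cov) via a 2x2 Cholesky factor."""
    l00 = math.sqrt(cov[0][0])
    l10 = cov[1][0] / l00
    l11 = math.sqrt(cov[1][1] - l10 * l10)
    z0, z1 = rng.gauss(0, 1), rng.gauss(0, 1)
    return beta[0] + l00 * z0, beta[1] + l10 * z0 + l11 * z1

def multiple_impute(xs, ys, m, rng):
    """m stochastic imputations pooled with Rubin's rules.
    Returns the pooled slope and its total standard error."""
    beta, cov = fit_observed(xs, ys)
    slopes, variances = [], []
    for _ in range(m):
        d0, d1 = draw_beta(beta, cov, rng)
        filled = [y if y is not None
                  else int(rng.random() < sigmoid(d0 + d1 * x))
                  for x, y in zip(xs, ys)]
        (_b0, slope), c = fit_logistic(xs, filled)
        slopes.append(slope)
        variances.append(c[1][1])
    q = sum(slopes) / m                      # pooled estimate
    w = sum(variances) / m                   # within-imputation variance
    b = (sum((s - q) ** 2 for s in slopes) / (m - 1)) if m > 1 else 0.0
    return q, math.sqrt(w + (1 + 1 / m) * b)  # total variance T = W + (1+1/m)B

if __name__ == "__main__":
    rng = random.Random(0)
    xs, ys = simulate(2000, -0.5, rng)
    (_b0, slope_reg), _cov = fit_logistic(xs, regression_impute(xs, ys))
    print("regression imputation slope:", round(slope_reg, 3))
    for m in (1, 5):
        est, se = multiple_impute(xs, ys, m, rng)
        print(f"multiple imputation, m={m}: slope={est:.3f}, se={se:.3f}")
```

With a single imputation the between-imputation variance cannot be estimated, so the sketch sets it to zero and the reported standard error reflects only within-imputation uncertainty. Changing `shift` in `simulate` raises or lowers the missing rate, which is the knob to vary when comparing the methods at low and high missingness.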

Keywords: binary variable; missing at random; regression imputation; multiple imputation

Classification code: G63 [Cultural Sciences: Education]

 
