Comparison of deletion in group method and multiple imputation in dealing with missing at random binary variable data  Cited by: 5

Authors: WANG Man (王曼)[1], SHI Nian (施念)[2], HUA Linlin (花琳琳)[3], YANG Yongli (杨永利)[4]

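Affiliations: [1] Editorial Department of the Journal of Zhengzhou University, Zhengzhou 450001; [2] Department of Clinical Medicine, Zhengzhou University, Zhengzhou 450001; [3] Office of Scientific Research and Foreign Affairs, The Second Affiliated Hospital of Zhengzhou University, Zhengzhou 450014; [4] Department of Health Statistics, College of Public Health, Zhengzhou University, Zhengzhou 450001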

Source: Journal of Zhengzhou University (Medical Sciences), 2012, No. 5, pp. 642-645 (4 pages)

Funding: National Key Science and Technology Program of the Tenth Five-Year Plan, project 2004BA719A13

Abstract: Aim: To evaluate two methods for handling binary variable data with values missing at random. Methods: Survey data on traditional Chinese medicine (TCM) syndromes of HIV/AIDS served as the data source. SAS 9.2 was used to simulate random missingness in the complete dataset, producing datasets with different proportions of missing values. Each missing dataset was handled in two ways: imputation with the logistic regression method of multiple imputation (MI/logistic), and deletion in group. A logistic regression model was fitted to each resulting dataset and compared with the model from the complete dataset. Results: With 10% missing, the results of deletion in group were closer to those of the complete dataset. With 20%~40% missing, the constant term and the regression coefficient of x after MI/logistic imputation deviated markedly from the complete dataset. With 50% missing, the regression coefficient and standard error of x from MI/logistic with 2 imputations were closer to the complete dataset. With 60% missing, the regression coefficient of x after MI/logistic imputation deviated severely from the complete dataset, and the standard error of the regression coefficient of x after deletion in group deviated markedly from it. Conclusion: When relatively little is missing (missing rate <40%), deletion in group performs better; at 50% missing, MI/logistic imputation performs better; at 60% missing or more, neither method is satisfactory.
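The comparison described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' SAS 9.2 procedure: the data are simulated rather than taken from the HIV/AIDS survey, the model has a single binary predictor x, missingness is applied to x completely at random (a special case of missing at random), a bootstrap of the observed cases stands in for drawing the imputation model's parameters, and 0.5 is added to every cell count to avoid division by zero. All function names are hypothetical.

```python
import math
import random

def fit_logit_2x2(pairs):
    """Closed-form logistic fit of binary y on binary x from (x, y) pairs.
    Returns (intercept, slope, se_slope). Assumption: 0.5 is added to every cell."""
    n = {(x, y): 0.5 for x in (0, 1) for y in (0, 1)}
    for x, y in pairs:
        n[(x, y)] += 1
    b0 = math.log(n[(0, 1)] / n[(0, 0)])          # log odds of y=1 when x=0
    b1 = math.log(n[(1, 1)] / n[(1, 0)]) - b0     # log odds ratio
    se = math.sqrt(sum(1 / v for v in n.values()))
    return b0, b1, se

def simulate(n, rng, b0=-0.5, b1=1.0):
    """Simulated complete dataset (illustrative parameters, not the paper's data)."""
    data = []
    for _ in range(n):
        x = 1 if rng.random() < 0.4 else 0
        p = 1 / (1 + math.exp(-(b0 + b1 * x)))
        data.append((x, 1 if rng.random() < p else 0))
    return data

def make_missing(data, rate, rng):
    """Set x to None with probability `rate`, independently for each record."""
    return [(None if rng.random() < rate else x, y) for x, y in data]

def complete_cases(data):
    """Deletion approach: keep only records with x observed."""
    return [(x, y) for x, y in data if x is not None]

def multiple_imputation(data, m, rng):
    """Impute x from a logistic model of x on y, m times (m >= 2),
    then pool the slope with Rubin's rules. Returns (slope, se)."""
    obs = complete_cases(data)
    fits = []
    for _ in range(m):
        boot = [rng.choice(obs) for _ in obs]     # stand-in for parameter draws
        p_x = {}
        for y in (0, 1):
            xs = [x for x, yy in boot if yy == y]
            p_x[y] = (sum(xs) + 0.5) / (len(xs) + 1)
        filled = [(x if x is not None else int(rng.random() < p_x[y]), y)
                  for x, y in data]
        fits.append(fit_logit_2x2(filled))
    slopes = [f[1] for f in fits]
    q = sum(slopes) / m
    within = sum(f[2] ** 2 for f in fits) / m
    between = sum((s - q) ** 2 for s in slopes) / (m - 1)
    return q, math.sqrt(within + (1 + 1 / m) * between)

rng = random.Random(0)
full = simulate(2000, rng)
_, b1_full, se_full = fit_logit_2x2(full)
print(f"complete data: b1={b1_full:.3f} se={se_full:.3f}")
for rate in (0.1, 0.3, 0.5):
    holed = make_missing(full, rate, rng)
    _, b1_cc, se_cc = fit_logit_2x2(complete_cases(holed))
    b1_mi, se_mi = multiple_imputation(holed, 5, rng)
    print(f"missing {rate:.0%}: deletion b1={b1_cc:.3f} se={se_cc:.3f} | "
          f"MI b1={b1_mi:.3f} se={se_mi:.3f}")
```

Because y is the only predictor in the imputation model, the logistic regression of x on y reduces to the two conditional proportions P(x=1 | y), which is why no iterative fitting is needed here. A model with more covariates would need a general logistic regression routine in its place.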

Keywords: binary variable; missing value; deletion in group method; multiple imputation

Classification: R195.1 [Medicine and Health: Health Statistics]

 
