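Authors: Hua Linlin (花琳琳)[1], Shi Nian (施念)[2], Yang Yongli (杨永利)[1], Zhao Tianyi (赵天仪), Shi Xuezhong (施学忠)[1]
Affiliations: [1] Department of Health Statistics, College of Public Health, Zhengzhou University, Zhengzhou 450001; [2] School of Basic Medical Sciences, Zhengzhou University, Zhengzhou 450001; [3] Shanghai Jiao Tong University School of Medicine, Shanghai 200025
Source: Journal of Zhengzhou University (Medical Sciences) (《郑州大学学报(医学版)》), 2012, No. 3, pp. 315-318 (4 pages)
Funding: National Key Technologies R&D Program of the Tenth Five-Year Plan, project 2004BA719A13-6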
Abstract: Aim: To compare how well different missing-value handling methods perform on data that are missing at random. Methods: Based on hemoglobin, white blood cell, and blood urea nitrogen measurements from HIV/AIDS blood specimens, SAS 9.1 was used to simulate a complete dataset and datasets with different missing rates. The methods were compared in terms of precision, accuracy, and distribution. Results: At every missing rate, the hemoglobin and white blood cell data produced by each method showed no statistically significant difference from the complete dataset. The multiple-imputation (MI) method had the highest precision at every missing rate. At missing rates of 10% to 20%, MI gave the highest accuracy. At a missing rate of 30%, deletion in groups gave the highest accuracy. At missing rates of 40% and above, the accuracy of the filled-in data was unstable. At every missing rate, the data produced by regression, by deletion in groups, and by MI with two imputations had distribution characteristics consistent with the complete dataset. Conclusion: When 10% to 20% of the data are missing, MI performs best; when 30% are missing, deletion in groups performs best; when 40% or more are missing, no method performs well.
Classification number: R195.1 [Medicine and Health: Health Statistics]
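The simulation design described in the abstract (build a complete dataset, remove values at several missing rates, apply a handling method, then compare against the complete data) can be illustrated with a minimal Python sketch. This is not the authors' SAS 9.1 procedure. The function names, the normal distribution and its parameters, and the sample size are all hypothetical. For simplicity, values are removed completely at random, and only simple deletion and mean imputation are shown, not the regression or MI methods the study evaluates.

```python
import random
import statistics


def simulate_complete(n, mean=130.0, sd=15.0, seed=0):
    """Draw a hypothetical complete dataset (parameters are illustrative only)."""
    rng = random.Random(seed)
    return [rng.gauss(mean, sd) for _ in range(n)]


def make_missing(values, rate, seed=1):
    """Replace a fixed fraction of values with None, chosen completely at random."""
    rng = random.Random(seed)
    n_missing = round(len(values) * rate)
    missing_idx = set(rng.sample(range(len(values)), n_missing))
    return [None if i in missing_idx else v for i, v in enumerate(values)]


def delete_missing(values):
    """Deletion: keep only the observed values."""
    return [v for v in values if v is not None]


def mean_impute(values):
    """Single imputation: fill every gap with the mean of the observed values."""
    fill = statistics.fmean(delete_missing(values))
    return [fill if v is None else v for v in values]


def summarize(values):
    """Return (mean, sample standard deviation) for comparison with the complete data."""
    return statistics.fmean(values), statistics.stdev(values)


if __name__ == "__main__":
    complete = simulate_complete(500)
    true_mean, true_sd = summarize(complete)
    handlers = (("deletion", delete_missing), ("mean imputation", mean_impute))
    for rate in (0.1, 0.2, 0.3, 0.4):
        incomplete = make_missing(complete, rate)
        for name, handler in handlers:
            m, s = summarize(handler(incomplete))
            print(f"rate={rate:.0%} {name}: "
                  f"mean bias={m - true_mean:+.2f}, sd ratio={s / true_sd:.2f}")
```

In this sketch, mean imputation leaves the mean equal to that of the observed values but shrinks the standard deviation, which is one reason a comparison of distribution characteristics, as in the study, is informative alongside accuracy.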