Authors: SHI Fu-yan [1], MA Jie, HUANG Lu, XU Xiao-shan, SUN Na [1], MENG Wei-jing [1], WANG Su-zhen [1], YANG Li-ping [2] (School of Public Health and Management, Weifang Medical University, Weifang, Shandong Province 261053, China)
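Affiliations: [1] Department of Health Statistics, Weifang Medical University, Weifang, Shandong 261053, China; [2] Health Medicine Center, Xijing Hospital, Fourth Military Medical University, Xi'an, Shaanxi, China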
Source: Chinese Journal of Public Health (《中国公共卫生》), 2019, No. 11, pp. 1536-1539 (4 pages)
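Funding: National Natural Science Foundation of China (81473071); Shaanxi Province Science and Technology Coordinated Innovation Project (2016KTZDSF02–07–01); Shandong Province Science and Technology Development Plan Project (2015WS0067); Weifang Medical University Doctoral Start-up Fund (2017BSQD51)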
Abstract: Objective To evaluate how well expectation maximization with bootstrapping (EMB) multiple imputation fills in missing quantitative variables in cross-sectional health examination data, and to provide evidence for choosing an appropriate multiple imputation method for such data. Methods Using measured cross-sectional data from 1 634 employees who had routine physical examinations at the Xijing Hospital Health Checkup Center in Xi'an, Shaanxi Province, from January to December 2013, we performed EMB multiple imputation with the Amelia II package in R 3.5.0. Results For cross-sectional quantitative health examination data with randomly missing values at single-variable missing rates of <10%, 20%, and 70%, EMB multiple imputation gave lower estimation errors than list deletion in all three scenarios. On the same data, imputation quality varied with the number of imputations, and m = 10 was the most suitable number for this data set. Probability density plots before and after imputation showed that, at m = 10, the density curve of the imputed values agreed well with that of the observed values. The over-fitting diagnostic plots further showed that, at m = 10, the 90% confidence intervals for most observations of each variable contained the best-fit line and were narrow. The multivariate logistic regression models built on the two analysis data sets, one processed with list deletion and the other with EMB multiple imputation, included different variables. Conclusion For quantitative variables with randomly missing values at different missing rates, EMB multiple imputation performs better than list deletion, and the optimal number of imputations differs between data sets with different missingness.
Keywords: health examination; missing data; expectation maximization with bootstrapping (EMB); multiple imputation
Classification: R195.1 [Medicine and Health: Health Statistics]
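The abstract compares list deletion with bootstrap-based multiple imputation. The sketch below illustrates that comparison on synthetic data and is not the study's Amelia II workflow. Everything in it is an assumption made for illustration: the two-variable data, the missingness rule, and the names `fit_line` and `emb_like_impute`. A least-squares fit on the observed pairs of each bootstrap resample stands in for the EM step.

```python
import random
import statistics


def fit_line(xs, ys):
    """Least-squares fit y = a + b*x; returns (a, b, residual_sd)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    sd = (sum(r * r for r in resid) / max(len(xs) - 2, 1)) ** 0.5
    return a, b, sd


def emb_like_impute(rows, m=10, seed=0):
    """rows: list of (x, y) with y possibly None. Returns m completed copies.

    Hypothetical simplification: each copy uses parameters estimated from
    one bootstrap resample, then draws the missing y values from them.
    """
    rng = random.Random(seed)
    completed = []
    while len(completed) < m:
        # Bootstrap step: resample the incomplete rows with replacement.
        boot = [rng.choice(rows) for _ in rows]
        obs = [(x, y) for x, y in boot if y is not None]
        if len({x for x, _ in obs}) < 3:
            continue  # degenerate resample, draw again
        # Estimation step (stand-in for EM): fit on observed pairs only.
        a, b, sd = fit_line([x for x, _ in obs], [y for _, y in obs])
        # Imputation step: observed values are kept, missing ones are drawn.
        completed.append(
            [(x, y if y is not None else rng.gauss(a + b * x, sd)) for x, y in rows]
        )
    return completed


# Synthetic example: y is more likely to be missing when x is large.
demo = random.Random(1)
xs = [demo.gauss(50, 10) for _ in range(200)]
full = [(x, 2.0 + 0.5 * x + demo.gauss(0, 1)) for x in xs]
rows = [(x, None if demo.random() < (0.6 if x > 50 else 0.1) else y) for x, y in full]

completed = emb_like_impute(rows, m=10)
true_mean = statistics.fmean([y for _, y in full])
listwise_mean = statistics.fmean([y for _, y in rows if y is not None])
# Pooled point estimate: the average of the m per-data-set estimates.
pooled_mean = statistics.fmean(
    [statistics.fmean([y for _, y in ds]) for ds in completed]
)
print("full-data mean:", round(true_mean, 3))
print("list deletion mean:", round(listwise_mean, 3))
print("pooled imputed mean:", round(pooled_mean, 3))
```

Comparing the two printed estimates with the full-data mean gives a rough, toy-scale version of the error comparison the abstract reports. Changing `m` shows how the pooled estimate moves with the number of imputations.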