Comparison of several imputation methods for longitudinal missing data and their application to follow-up data on Alzheimer's disease  Cited by: 3


Authors: HAN Hong-juan[1], GE Xiao-yan[1], LIU Long[1], YANG Lin[1], YU Hong-mei[1]

Affiliation: [1] Department of Health Statistics, Shanxi Medical University, Taiyuan, Shanxi 030001, China

Source: Modern Preventive Medicine (《现代预防医学》), 2018, Issue 22, pp. 4033-4037, 4125 (6 pages)

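Funding: Research on an individualized dynamic dementia risk prediction model based on multidimensional measurement of cognitive function and latent structure; National Natural Science Foundation of China (81673277)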

Abstract: Objective To compare several imputation methods suitable for longitudinal missing data and to select the best one for imputing missing values in follow-up data on Alzheimer's disease (AD).
Methods For longitudinal data in which continuous variables were missing under a missing at random mechanism, random data sets with missing proportions of 10%, 20%, 30%, 40% and 50% were simulated. Missing values were imputed by last observation carried forward (LOCF), the Markov chain Monte Carlo (MCMC) method and fully conditional specification (FCS). The imputation results were compared on criteria of unbiasedness and efficiency, and the best-performing method was applied to the imputation of systolic blood pressure and Montreal Cognitive Assessment (MoCA) scores in an AD follow-up study.
Results (1) When the time variable was not taken into account and several continuous variables had missing values, the MCMC method showed a clear advantage at every missing rate; LOCF was simple and reasonably effective at lower missing rates; FCS performed poorly in all scenarios. When missingness was severe, with a missing rate above 40%, none of the methods gave satisfactory results. (2) When the MCMC method was applied to the missing AD follow-up data, imputation of both systolic blood pressure and MoCA scores was best with three imputations.
Conclusion To obtain the best results when handling missing data, both the imputation method and an appropriate number of imputations need to be considered.
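As a rough illustration of the simulation design described in the abstract, the sketch below deletes follow-up values at the five stated missing rates, fills them by LOCF, and reports the mean bias of the filled values. It is not the authors' code: the trajectory model (baseline near 130 with a drift of 2 per visit), the function names, the sample sizes and the seed are all assumptions made for illustration. Values are also deleted completely at random for simplicity, which is a stronger assumption than the missing at random mechanism the study simulated. MCMC and FCS imputation are not shown.

```python
import random
from statistics import mean


def simulate_subject(n_visits, rng):
    """Hypothetical trajectory: random baseline, upward drift, visit noise."""
    baseline = rng.gauss(130.0, 10.0)
    return [baseline + 2.0 * t + rng.gauss(0.0, 2.0) for t in range(n_visits)]


def delete_at_random(values, rate, rng):
    """Replace each follow-up value with None with probability `rate`.

    The baseline visit (index 0) is always kept so LOCF has a starting value.
    """
    return [v if t == 0 or rng.random() >= rate else None
            for t, v in enumerate(values)]


def locf(values):
    """Last observation carried forward: fill None with the previous value."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def bias_by_rate(rates, n_subjects=200, n_visits=6, seed=1):
    """Mean (imputed - true) over the deleted cells, for each missing rate."""
    rng = random.Random(seed)
    result = {}
    for rate in rates:
        diffs = []
        for _ in range(n_subjects):
            truth = simulate_subject(n_visits, rng)
            observed = delete_at_random(truth, rate, rng)
            filled = locf(observed)
            diffs.extend(f - t for f, t, o in zip(filled, truth, observed)
                         if o is None)
        result[rate] = mean(diffs) if diffs else float("nan")
    return result


if __name__ == "__main__":
    for rate, bias in bias_by_rate([0.1, 0.2, 0.3, 0.4, 0.5]).items():
        print(f"missing rate {rate:.0%}: mean LOCF bias {bias:+.2f}")
```

Because the assumed trajectories drift upward, LOCF fills each gap with an older and therefore lower value, so the bias in this toy setting is negative. This shows how a bias criterion can be computed, and says nothing about the results reported in the study.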

Keywords: longitudinal missing data; imputation; Alzheimer's disease; follow-up data

Classification number: R181.2 [Medicine and Health: Epidemiology]

 
