Missing Data Handling Methods for Quantitative Longitudinal Data: A Simulation Study  (Cited by: 14)


Authors: Chen Lichang; Heng Mingli; Wang Jun [2]; Chen Pingyan (Department of Biostatistics, School of Public Health, Southern Medical University, Guangzhou 510515)

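Affiliations: [1] Department of Biostatistics, School of Public Health, Southern Medical University, 510515 [2] Center for Drug Evaluation, National Medical Products Administration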

Source: Chinese Journal of Health Statistics (《中国卫生统计》), 2020, No. 3, pp. 384-388 (5 pages)

Abstract: Objective: To compare the statistical performance of last observation carried forward (LOCF), the mixed model for repeated measures (MMRM), and multiple imputation (MI) in handling missing longitudinal data. Methods: For a two-arm trial with four visits and three levels of between-visit correlation, Monte Carlo simulation was used to generate complete longitudinal datasets, from which missing datasets were derived under two missing proportions and three missing mechanisms: missing completely at random (MCAR), missing at random (MAR), and missing not at random (MNAR). With the analysis of the complete data as the benchmark, each method was evaluated on type I error, power, estimation error of the between-group treatment difference, and the width of its 95% confidence interval (95% CI). Results: In all scenarios, MMRM and MI controlled the type I error, with power slightly lower than that of the complete-data analysis; LOCF mostly failed to control the type I error, and its power varied widely. In most scenarios, MMRM and MI had low point-estimation error, whereas LOCF was unstable. In all scenarios, MI gave the widest 95% CI, followed by MMRM, with LOCF the narrowest. Conclusion: Under the MCAR and MAR mechanisms, MMRM and MI performed comparably and responded to the simulation factors in a regular way, so either can be chosen as the primary analysis depending on the circumstances. Because of the way it imputes, LOCF underestimates variability and therefore appears more precise, but its major weaknesses are a lack of robustness and poor control of the type I error, so it should be used with caution. Sensitivity analyses based on the MNAR mechanism are necessary to assess the robustness of trial results.
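The simulation design described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the compound-symmetry correlation, the linear treatment mean profile, the logistic dropout model, and the per-visit `hazard` parameter are all assumptions made here, and the function names (`simulate_complete`, `impose_dropout`, `locf`, `final_visit_difference`) are hypothetical. Only LOCF is implemented; MMRM and MI would need a mixed-model or imputation library and are left out.

```python
import math
import random

N_VISITS = 4  # four visits, as in the study design


def simulate_complete(n_per_arm, effect, rho, rng):
    """Simulate complete data for two arms as a list of (arm, outcomes).

    Assumption: unit variance and compound-symmetry correlation rho, built from
    a shared subject effect plus independent visit noise.
    Assumption: control mean is 0; treatment mean rises linearly to `effect`.
    """
    rows = []
    for arm in (0, 1):
        for _ in range(n_per_arm):
            subject = rng.gauss(0.0, 1.0)
            y = []
            for t in range(N_VISITS):
                mean = arm * effect * (t + 1) / N_VISITS
                noise = rng.gauss(0.0, 1.0)
                y.append(mean + math.sqrt(rho) * subject
                         + math.sqrt(1.0 - rho) * noise)
            rows.append((arm, y))
    return rows


def dropout_probability(mechanism, hazard, y, t):
    """Probability of dropping out at visit index t (t >= 1); assumed model."""
    if mechanism == "MCAR":
        return hazard            # unrelated to any outcome
    if mechanism == "MAR":
        driver = y[t - 1]        # depends on the last observed value
    elif mechanism == "MNAR":
        driver = y[t]            # depends on the value that would go missing
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")
    return min(1.0, 2.0 * hazard / (1.0 + math.exp(-driver)))


def impose_dropout(rows, mechanism, hazard, rng):
    """Return copies of the rows with monotone dropout; missing values are NaN."""
    result = []
    for arm, y in rows:
        y_obs = list(y)
        for t in range(1, N_VISITS):  # the first visit is always observed
            if rng.random() < dropout_probability(mechanism, hazard, y, t):
                for k in range(t, N_VISITS):
                    y_obs[k] = math.nan
                break
        result.append((arm, y_obs))
    return result


def locf(y_obs):
    """Last observation carried forward for one subject's visit sequence."""
    filled = list(y_obs)
    for t in range(1, len(filled)):
        if math.isnan(filled[t]):
            filled[t] = filled[t - 1]
    return filled


def final_visit_difference(rows):
    """Treatment-minus-control difference in means at the last visit."""
    last = {0: [], 1: []}
    for arm, y in rows:
        last[arm].append(y[-1])
    return sum(last[1]) / len(last[1]) - sum(last[0]) / len(last[0])


rng = random.Random(2020)
complete = simulate_complete(n_per_arm=100, effect=0.5, rho=0.5, rng=rng)
observed = impose_dropout(complete, "MAR", hazard=0.1, rng=rng)
after_locf = [(arm, locf(y)) for arm, y in observed]
print(final_visit_difference(complete), final_visit_difference(after_locf))
```

Repeating the last four lines over many replicates, and over each mechanism, would give a Monte Carlo estimate of how far the LOCF estimate drifts from the complete-data benchmark. Note that `hazard` here is a per-visit dropout probability, not the overall missing proportion used in the study.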

Keywords: missing data; longitudinal data; last observation carried forward; mixed model for repeated measures; multiple imputation

Classification: R195.1 [Medicine and Health: Health Statistics]

 
