Analysis of the Effect of Design Weights on Model-based Inference  (Cited by: 8)


Authors: JIN Yong-jin [1,2,3]; LIU Xiao-yu

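Affiliations: [1] Center for Applied Statistics, Renmin University of China, Beijing 100872, China; [2] School of Statistics, Renmin University of China, Beijing 100872, China; [3] Institute of Survey Technology, Renmin University of China, Beijing 100872, China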

Source: Journal of Statistics and Information (《统计与信息论坛》), 2022, No. 3, pp. 3-13 (11 pages)

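Funding: National Statistical Science Research Key Project "Research on Sample Integration for Large-scale Sample Surveys and Its Effectiveness" (2020LZ27).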

Abstract: Inference about population parameters from sample survey data usually follows one of two approaches: design-based inference or model-based inference. Design-based inference rests on randomization theory and depends on the sampling design; its estimators are unbiased and consistent in large samples, but they are inefficient when the sample is small or when non-sampling errors are present. Model-based inference treats the finite population as a random sample from an infinite superpopulation and depends on model assumptions. Building a superpopulation model is highly flexible, which helps make full use of auxiliary population information and improves estimation precision, but the estimates are subject to error when the model is misspecified or the sample selection mechanism is informative. How to combine the two approaches, so that the sample's representativeness of the population is reflected while estimation efficiency and good estimator properties are preserved, remains an open question. Weights play a central role in design-based inference: they reflect the effect of the sampling design on the sample and allow sample results to be scaled back to the population. Introducing weights into model-based inference makes its results representative of the population, better exploits the combined strengths of the two inference frameworks, and weakens the influence of model assumptions on the results. Accordingly, starting from the effect of weights on model-based inference and focusing on causal inference, this paper proposes introducing weights into the fitting of both the propensity score model and the prediction model to construct a doubly robust estimator, and verifies the method through a simulation study. The results show that estimating treatment effects with the proposed method makes full use of the weights and yields more accurate and more robust estimates. The empirical part analyzes data from the 2017 CGSS survey and further shows that the effect of the sampling design should be fully considered when making model-based inferences from survey data, providing a reference for researchers conducting causal inference and other survey-based research.
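The idea of using weights in both the propensity score model and the prediction model can be illustrated with a minimal sketch. This is not the authors' implementation; the abstract does not give the estimator's exact form. The sketch assumes a survey-weighted logistic propensity model, survey-weighted linear outcome models fitted separately per treatment arm, and a weighted augmented inverse-probability-weighting (AIPW) combination. All function names (`weighted_logit`, `weighted_ols`, `weighted_dr_ate`) and the simulated data are hypothetical.

```python
import numpy as np

def weighted_logit(X, t, w, n_iter=50):
    """Survey-weighted logistic regression fitted by Newton's method."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (w * (t - p))                      # weighted score
        hess = (X * (w * p * (1.0 - p))[:, None]).T @ X  # weighted information
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < 1e-10:
            break
    return beta

def weighted_ols(X, y, w):
    """Weighted least squares: minimizes sum_i w_i * (y_i - x_i'b)^2."""
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef

def weighted_dr_ate(x, t, y, w):
    """Assumed weighted AIPW estimate of the population average treatment effect."""
    X = np.column_stack([np.ones(len(y)), x])
    # Propensity scores from the weighted logistic model, clipped for stability.
    e = 1.0 / (1.0 + np.exp(-X @ weighted_logit(X, t, w)))
    e = np.clip(e, 0.01, 0.99)
    # Weighted outcome (prediction) models, one per treatment arm.
    m1 = X @ weighted_ols(X[t == 1], y[t == 1], w[t == 1])
    m0 = X @ weighted_ols(X[t == 0], y[t == 0], w[t == 0])
    # Doubly robust pseudo-outcomes, averaged with the survey weights.
    psi1 = m1 + t * (y - m1) / e
    psi0 = m0 + (1 - t) * (y - m0) / (1.0 - e)
    return np.sum(w * (psi1 - psi0)) / np.sum(w)

# Hypothetical simulation: finite population with a heterogeneous effect 2 + x,
# so the population average treatment effect is close to 2.
rng = np.random.default_rng(0)
N = 20000
x = rng.normal(size=N)
t = rng.binomial(1, 1.0 / (1.0 + np.exp(-0.5 * x)))
y = 1.0 + (2.0 + x) * t + x + rng.normal(size=N)

# Informative sampling: inclusion probability increases with x.
pi = np.clip(1.0 / (1.0 + np.exp(-(x - 1.0))), 0.05, 0.9)
s = rng.random(N) < pi

ate = weighted_dr_ate(x[s], t[s], y[s], 1.0 / pi[s])
print(ate)
```

In this simulated setting the sample over-represents units with large `x`, so an unweighted average of the same pseudo-outcomes would target the sample rather than the population; using the design weights `1 / pi` in both model fits and in the final average targets the population quantity.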

Keywords: weights; model-based inference; causal inference; doubly robust estimation

Classification code: C81 [Sociology - Statistics]

 
