Propensity Score Model Inference Method of Non-probability Samples Based on Model-X Knockoffs


Authors: Liu Zhan; Zheng Junbo; Liu Yang [2]; Pan Yingli

Affiliations: [1] Faculty of Mathematics and Statistics, Hubei University, Wuhan 430062, China; [2] School of Economics and Business Administration, Central China Normal University, Wuhan 430079, China

Source: Statistics & Decision (《统计与决策》), 2023, No. 4, pp. 10-15 (6 pages)

Funding: General Program of the National Social Science Fund of China (18BTJ022).

Abstract: Most samples arising in big data settings are non-probability samples: their inclusion probabilities are unknown, and they often come with many covariates, sometimes high-dimensional ones. How to draw inferences from non-probability samples in this setting is therefore worth exploring. To address this problem, the paper exploits the dimension-reduction property of Model-X Knockoffs: Model-X Knockoffs is used to select the important variables, and a Logistic propensity score model is then built on them to estimate the inclusion probabilities (propensity scores) of the non-probability sample for population inference. This improves the precision of estimation while controlling the false discovery rate (FDR) and the power of variable selection. Simulation and empirical results show that the population mean estimator from the Logistic propensity score model based on Model-X Knockoffs has smaller bias and higher efficiency, and performs better overall, than the population mean estimators from the ordinary Logistic propensity score model and the generalized linear regression model. The proposed method also controls the FDR well, and its power is close to 1.
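The two ends of the pipeline described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes that knockoff feature statistics `W` (for example, LASSO coefficient differences between each covariate and its knockoff copy) and fitted propensity scores are already available. It also assumes the knockoff+ threshold and a Hájek-type inverse-propensity-weighted mean, neither of which the abstract specifies. The function names `knockoff_threshold`, `select_variables`, and `hajek_mean` are hypothetical.

```python
import numpy as np


def knockoff_threshold(W, q=0.1, offset=1):
    """Data-dependent threshold of the knockoff filter.

    W      : feature statistics; a large positive W[j] is evidence that
             covariate j matters more than its knockoff copy.
    q      : target false discovery rate.
    offset : 1 gives the knockoff+ rule (assumed here), 0 the plain rule.
    Returns np.inf when no threshold meets the target.
    """
    W = np.asarray(W, dtype=float)
    # Candidate thresholds are the nonzero magnitudes of W, smallest first.
    for t in np.sort(np.abs(W[W != 0])):
        # Estimated false discovery proportion at threshold t.
        fdp_hat = (offset + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            return float(t)
    return np.inf


def select_variables(W, q=0.1):
    """Indices of covariates whose statistic reaches the threshold."""
    tau = knockoff_threshold(W, q)
    return np.flatnonzero(np.asarray(W, dtype=float) >= tau)


def hajek_mean(y, propensity):
    """Inverse-propensity-weighted (Hajek) estimate of a population mean.

    y          : outcomes observed in the non-probability sample.
    propensity : estimated inclusion probabilities of the same units,
                 e.g. from a logistic model on the selected covariates.
    """
    w = 1.0 / np.asarray(propensity, dtype=float)
    return float(np.sum(w * np.asarray(y, dtype=float)) / np.sum(w))
```

In use, `select_variables(W, q)` would give the covariate columns on which to fit the Logistic propensity score model, and `hajek_mean` would then weight each sampled outcome by the inverse of its fitted propensity score.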

Keywords: non-probability sample; Model-X Knockoffs; LASSO; propensity score

Classification code: C811 [Sociology - Statistics]

 
