Research on Multi-Objective Sampling Method Based on Two-Step Subsampling Algorithm (基于两步子抽样算法的多目标抽样统计推断研究)


Authors: LI Li-li (李莉莉)[1]; ZHOU Kai-he (周楷贺); DU Mei-hui (杜梅慧) (School of Economics, Qingdao University, Qingdao 266100, China; School of Economics, Nankai University, Tianjin 300350, China)

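Affiliations: [1] School of Economics, Qingdao University, Qingdao, Shandong 266100; [2] Institute of Quantitative Economics, Nankai University, Tianjin 300072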

Source: Journal of Applied Statistics and Management (《数理统计与管理》), 2023, No. 6, pp. 1037-1060 (24 pages)

Funding: National Social Science Fund of China project (2019BTJ028).

Abstract: For massive data, subsampling is a popular way to simplify computation and reduce its cost. Existing research focuses mainly on estimation for a single target variable, yet multi-objective sampling also arises often in practice. This paper proposes a mean two-step subsampling algorithm for multi-objective sampling under generalized linear models. The two-step subsampling algorithm, proposed by Wang et al. (2018)[1], determines the inclusion probability of each sampling unit based on the ideas of L-optimality and A-optimality. Building on it, this paper defines the inclusion probability of each unit in multi-objective sampling and derives the asymptotic properties of the model parameter estimators. Finally, simulated data and real-data examples are used to compare the mean two-step subsampling algorithm with the multi-objective two-step subsampling method. The results show that, for the same sample size, the mean two-step subsampling algorithm under the A-optimality criterion achieves higher estimation accuracy than both MPPS sampling based on the two-step subsampling algorithm and the mean multi-objective two-step subsampling algorithm under the L-optimality criterion. It also improves computational efficiency significantly relative to full-sample estimation, saving computation time.
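The abstract does not state the formulas, so the following is only a rough sketch of how a two-step subsampling scheme with averaged inclusion probabilities might look for logistic regression, one member of the generalized linear model family. Several things here are assumptions rather than the paper's method: the L-optimality form of the probabilities (proportional to |y_i - p_i| times the norm of x_i, as commonly given for Wang et al. (2018)), the reading of "mean" as a simple average of the per-target probabilities, the choice to refit on the second-step draw only, and all function and variable names, which are hypothetical.

```python
import numpy as np

def weighted_logistic_mle(X, y, w, n_iter=50, tol=1e-8):
    """Newton's method for the weighted logistic log-likelihood."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (w * (y - p))
        info = (X * (w * p * (1.0 - p))[:, None]).T @ X  # weighted information matrix
        step = np.linalg.solve(info, grad)
        beta += step
        if np.linalg.norm(step) < tol:
            break
    return beta

def l_optimal_probs(X, y, beta):
    """Assumed L-optimality probabilities: proportional to |y_i - p_i| * ||x_i||."""
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    score = np.abs(y - p) * np.linalg.norm(X, axis=1)
    return score / score.sum()

def mean_two_step_subsample(X, Y, r0, r, rng):
    """Y holds one binary target variable per column (hypothetical layout)."""
    n, K = Y.shape
    # Step 1: uniform pilot subsample and one pilot fit per target variable.
    pilot = rng.choice(n, size=r0, replace=True)
    probs = np.zeros(n)
    for k in range(K):
        beta0 = weighted_logistic_mle(X[pilot], Y[pilot, k], np.ones(r0))
        probs += l_optimal_probs(X, Y[:, k], beta0)
    probs /= K  # assumption: "mean" = average of the per-target probabilities
    # Step 2: draw with the averaged probabilities, then refit each target
    # with inverse-probability weights (rescaled; the maximizer is unchanged).
    idx = rng.choice(n, size=r, replace=True, p=probs)
    w = 1.0 / probs[idx]
    w = w / w.mean()
    betas = np.column_stack(
        [weighted_logistic_mle(X[idx], Y[idx, k], w) for k in range(K)]
    )
    return probs, betas

# Small synthetic demonstration (made-up data, not from the paper).
rng = np.random.default_rng(0)
n = 5000
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
true_betas = np.array([[0.5, -0.5], [1.0, 0.5], [-1.0, 1.0]])  # one column per target
Y = (rng.random((n, 2)) < 1.0 / (1.0 + np.exp(-X @ true_betas))).astype(float)
probs, betas = mean_two_step_subsample(X, Y, r0=300, r=1000, rng=rng)
```

An A-optimality variant would replace the norm of x_i with the norm of M^{-1} x_i, where M is the information matrix evaluated at the pilot estimate; it is left out here to keep the sketch short.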

Keywords: big data; two-step subsampling algorithm; generalized linear model

Classification code: O212.2 [Science: Probability Theory and Mathematical Statistics]

 
