Parameters Estimation for the Zero-Inflated Logarithmic Series Distribution (零膨胀对数级数分布的参数估计)  Cited by: 2


Authors: SHENG Jianwei (盛建为); QIAN Xiyuan (钱夕元) [1] (School of Science, East China University of Science and Technology, Shanghai 200237, China)

Affiliation: [1] School of Science, East China University of Science and Technology

Source: Journal of East China University of Science and Technology (Natural Science Edition), 2019, No. 3, pp. 507-510 (4 pages)

Funding: National High Technology Research and Development Program of China ("863" Program), grant 2015AA20107

Abstract: The logarithmic series distribution is a common long-tailed distribution that is widely applied to count data taking positive integer values, such as species abundance in a forest or the types of fish in a sea area. In practice, however, some count data consist mostly of zeros, which the logarithmic series distribution cannot accommodate. To fit these excess zeros, this paper extends the logarithmic series distribution to a zero-inflated logarithmic series distribution within the framework of the zero-inflated model. Three methods are used to estimate the model parameters: moment estimation, maximum likelihood estimation, and Bayesian estimation. For Bayesian estimation, the posterior distribution is sampled with the random walk Metropolis algorithm, because it has no analytical form. Simulated data from the zero-inflated logarithmic series distribution are generated by the Monte Carlo method, and the mean squared error is used to compare the accuracy of the estimation methods. The results show that the Bayesian method is more accurate than the traditional estimation methods when the sample size is small and comparable to them when the sample size is large, so its advantage is most pronounced when only a few samples are available. Finally, the model is fitted to the number of clinical readmissions within ninety days, a data set in which more than sixty percent of the values are zero, and the fit is fairly good.
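The abstract does not state the probability mass function, so the following is only a minimal sketch under an assumed parameterization: P(X = 0) = phi and P(X = k) = (1 - phi) * (-1 / ln(1 - theta)) * theta^k / k for k >= 1. The names `zils_pmf`, `sample_zils`, and `fit_zils_mle` are hypothetical and do not come from the paper. Under this assumption the likelihood factorizes, so the maximum likelihood estimate of phi is the observed proportion of zeros, and theta solves the logarithmic series mean equation on the positive observations. The moment and Bayesian estimators discussed in the abstract are not shown.

```python
import math
import random


def zils_pmf(k, phi, theta):
    """PMF of the assumed zero-inflated logarithmic series model."""
    if k == 0:
        return phi
    return (1.0 - phi) * (-1.0 / math.log(1.0 - theta)) * theta ** k / k


def sample_zils(n, phi, theta, rng):
    """Monte Carlo draw: zero with probability phi, else log-series by inversion."""
    norm = -1.0 / math.log(1.0 - theta)
    out = []
    for _ in range(n):
        if rng.random() < phi:
            out.append(0)
            continue
        u, k = rng.random(), 1
        p = norm * theta              # P(K = 1) for the log-series part
        cdf = p
        while u > cdf and k < 10_000:  # cap guards against round-off in the tail
            k += 1
            p *= theta * (k - 1) / k  # p_k = p_{k-1} * theta * (k - 1) / k
            cdf += p
        out.append(k)
    return out


def log_series_mean(theta):
    """Mean of the (non-inflated) logarithmic series distribution."""
    return -theta / ((1.0 - theta) * math.log(1.0 - theta))


def fit_zils_mle(data):
    """Return (phi_hat, theta_hat) under the assumed parameterization."""
    positives = [x for x in data if x > 0]
    if not positives:
        raise ValueError("theta is not identifiable without positive counts")
    phi_hat = (len(data) - len(positives)) / len(data)
    target = sum(positives) / len(positives)
    # The log-series mean increases with theta, so bisection finds the root.
    lo, hi = 1e-9, 1.0 - 1e-9
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if log_series_mean(mid) < target:
            lo = mid
        else:
            hi = mid
    return phi_hat, 0.5 * (lo + hi)
```

For example, `fit_zils_mle(sample_zils(20000, 0.6, 0.7, random.Random(0)))` should return estimates close to the true values 0.6 and 0.7. Repeating this over many simulated data sets and averaging the squared errors gives the kind of mean squared error comparison the abstract describes.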

Keywords: logarithmic series distribution; zero-inflated model; Bayesian method

Classification number: O213.9 [Science: Probability Theory and Mathematical Statistics]

 
