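Establishment of an ARIMA model and a study of its prediction effect on the monthly number of reported pulmonary tuberculosis cases in China  Cited by: 18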

A study of prediction effect of autoregressive integrated moving average model on the monthly reported pulmonary tuberculosis cases in China


Authors: ZHANG Shun-xian; QIU Lei; ZHANG Shao-yan[1]; LI Cui[1]; HU Jun[2]; TIAN Li-ming; LU Zhen-hui[1] (Respiratory Research Institute of Longhua Hospital, Shanghai University of Traditional Chinese Medicine, Shanghai 200032, China)

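Affiliations: [1] Respiratory Research Institute, Longhua Hospital, Shanghai University of Traditional Chinese Medicine, 200032; [2] Microbiology Laboratory, Longhua Hospital, Shanghai University of Traditional Chinese Medicine, 200032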

Source: Chinese Journal of Antituberculosis, 2020, Issue 6, pp. 614-620 (7 pages)

Funding: National Science and Technology Major Project of the 13th Five-Year Plan (2018ZX10725-509).

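Abstract: Objective To establish an autoregressive integrated moving average (ARIMA) model and evaluate how well it predicts the monthly number of reported pulmonary tuberculosis cases nationwide (excluding the Hong Kong, Macao and Taiwan regions of China; the same applies below), in order to provide a scientific reference for formulating pulmonary tuberculosis prevention and control measures. Methods Monthly numbers of reported pulmonary tuberculosis cases nationwide from January 2006 to August 2019 were collected from the monthly overviews of class A, B and C notifiable infectious diseases published in Disease Surveillance, a journal sponsored by the Chinese Center for Disease Control and Prevention. Using SPSS 26.0, a time series was built from the monthly counts for January 2006 to December 2018, and the ARIMA model type and orders were preliminarily identified. Several ARIMA models were then screened by the following criteria: model parsimony; statistical significance of every model parameter [autoregressive (AR), moving average (MA), seasonal autoregressive (SAR) and seasonal moving average (SMA) terms; all P<0.05]; a Ljung-Box Q statistic (the overall model test) with P>0.05; the largest stationary R²; the smallest normalized Bayesian information criterion (NBIC); and the smallest root mean square error (RMSE). The reported case numbers for January to August 2019 served as validation data; on the principle that a smaller relative error of the predicted values indicates a better model, the model with the smallest relative error was chosen as the optimal model. Finally, that model was used to predict the monthly number of reported pulmonary tuberculosis cases in China from September 2019 to December 2020. Results A time series was built from the monthly reported case numbers for each year from 2006 to 2018, and it was determined that an ARIMA(p,d,q) or ARIMA(p,d,q)×(P,D,Q) model needed to be fitted. Twelve basic models were screened out by requiring a Ljung-Box Q P value >0.05, model parsimony, and statistically significant model parameters (all P<0.05). Among these, the model with the largest R² [ARIMA(1,0,1)(0,1,1)12, R²=0.707], the model with the smallest RMSE [ARIMA(0,1,2)(0,1,1)12, RMSE=9147.85], the model with the smallest NBIC [ARIMA(0,1,1)(0,1,1)12, NBIC=18.355] and the model with the smallest Ljung-Box Q [ARIMA(1,1,1)(0,1,1)12, Ljung-Box Q=8.797] were retained as candidate models. These were used to predict the monthly reported pulmonary tuberculosis cases in China for January to August 2019, and the predictions were compared with the actual monthly reported numbers to determine the smallest mean relative prediction error (0.55%), MA(1)=0.875 (t=19.243, P<0.001), SMA(1)=0.876 (t=7.596, P<0.

Abstract (journal's English version): Objective An autoregressive integrated moving average (ARIMA) model was used to predict the monthly pulmonary tuberculosis cases in China (excluding the Hong Kong, Macao and Taiwan regions) to provide a reference for pulmonary tuberculosis prevention and control. Methods Monthly numbers of pulmonary tuberculosis cases in China from January 2006 to December 2018, as reported in Disease Surveillance (sponsored by the CDC), were collected. Based on these data, a time series was built and the ARIMA model type was preliminarily identified and its orders determined using SPSS 26.0. Several ARIMA models were selected according to the simplicity of the model, the statistical significance of the ARIMA model parameters (including autoregressive (AR), moving average (MA), seasonal autoregressive (SAR) and seasonal moving average (SMA) terms; all P<0.05), the overall test index (Ljung-Box Q value), the maximum stationary R², the minimum normalized Bayesian information criterion (NBIC) and the minimum root mean square error (RMSE). The numbers of reported cases from January to August 2019 were used for verification, and the model with the smallest relative error was selected as the optimal model, on the principle that the smaller the relative error of the predicted value, the better the model. Finally, the model was used to predict the monthly reported numbers of tuberculosis patients in China from September 2019 to December 2020. Results The time series was based on cases from January 2006 to December 2018, and the model to be fitted was ARIMA(p,d,q) or ARIMA(p,d,q)×(P,D,Q). Twelve models were selected according to a P value for Ljung-Box Q >0.05, the simplicity of the model, and the statistical significance of the model parameters (all P<0.05); and the models with the maximum R² (ARIMA(1,0,1)(0,1,1)12, R²=0.707), the minimum RMSE (ARIMA(0,1,2)(0,1,1)12, RMSE=9147.85), the minimum NBIC (ARIMA(0,1,1)(0,1,1)12, NBIC=18.355) or the minimum Ljung-Box Q (ARIMA(1,1,1)(0,1,1)12, Ljung-Box Q=8.797) were taken as alte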
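The screening steps named in the abstract (differencing, the Ljung-Box Q statistic, and choosing the candidate with the smallest relative error on held-out months) can be illustrated with a minimal standard-library Python sketch. This is not the authors' SPSS 26.0 workflow, and it does not fit any ARIMA model. The function names `difference`, `ljung_box_q`, `mean_relative_error` and `pick_best` are hypothetical, and taking the absolute value of each relative error before averaging is an assumption, since the abstract does not say how the mean relative error was computed.

```python
from typing import Dict, List, Sequence


def difference(series: Sequence[float], lag: int = 1) -> List[float]:
    """Return series[t] - series[t - lag].

    lag=1 corresponds to regular differencing (d=1); lag=12 corresponds to
    seasonal differencing of monthly data (D=1 with period 12).
    """
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def ljung_box_q(residuals: Sequence[float], max_lag: int) -> float:
    """Ljung-Box statistic Q = n(n+2) * sum_{k=1..h} r_k^2 / (n - k),
    where r_k is the lag-k sample autocorrelation of the residuals."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [r - mean for r in residuals]
    denom = sum(c * c for c in centered)
    total = 0.0
    for k in range(1, max_lag + 1):
        r_k = sum(centered[t] * centered[t - k] for t in range(k, n)) / denom
        total += r_k * r_k / (n - k)
    return n * (n + 2) * total


def mean_relative_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean of |predicted - actual| / actual over the validation months.

    Assumption: absolute relative errors are averaged.
    """
    return sum(abs(p - a) / a for a, p in zip(actual, predicted)) / len(actual)


def pick_best(actual: Sequence[float], candidates: Dict[str, Sequence[float]]) -> str:
    """Return the name of the candidate whose forecasts have the smallest
    mean relative error against the held-out actual values."""
    return min(candidates, key=lambda name: mean_relative_error(actual, candidates[name]))
```

In use, `candidates` would map each candidate model label to its forecasts for the validation months, and `actual` would hold the reported counts for the same months; the forecasts themselves would come from whatever package fitted the models.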

Keywords: Tuberculosis; Disease notification; Epidemiologic research design; Models, statistical; Forecasting

Classification number: R52 [Medicine and Health: Internal Medicine]

 
