Estimating the onset dates of 2019-nCoV Cases and its impact factors  Cited by: 3


Authors: ZHOU Yu; ZHANG Guo-ping [2]; XUE Yi-fei; WANG Ru-liang; WAN Ran-ran; PANG Hua-ji (Jiangxi Meteorological Service Center, Nanchang, Jiangxi 330096, China)

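Affiliations: [1] Jiangxi Meteorological Service Center, Nanchang, Jiangxi 330096 [2] Public Meteorological Service Center, China Meteorological Administration [3] Jiangxi Provincial Basic Geographic Information Center [4] Jiangxi Provincial Engineering Research Center of Surveying, Mapping and Geoinformation [5] Qingdao Meteorological Bureau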

Source: Modern Preventive Medicine, 2020, No. 24, pp. 4422-4426 (5 pages)

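Funding: National Natural Science Foundation of China (41871020); Innovation Fund of the Public Meteorological Service Center, China Meteorological Administration (M2020032).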

Abstract: Objective To explore the factors that influence the onset date and to estimate missing onset dates, so as to improve the COVID-19 case database and provide a reference for big-data analysis of the epidemic. Methods Based on newly confirmed 2019-nCoV case data published by health commissions at all levels in Jiangxi Province from January 22 to February 25, 2020, a time-series database was established and the distributions of onset dates and confirmation dates were analyzed. A random forest algorithm was used to study the relationship between the onset date and the confirmation date, patient information (sex, age, travel history to Wuhan, etc.), the latitude and longitude of the patient's residence, and the distance between the residence and Nanchang. Root mean square error (RMSE) and the coefficient of determination (R²) were used to evaluate the estimation accuracy of the model, and the mean decrease in accuracy was calculated to rank the importance of each factor for onset-date estimation. Results The confirmation date played a decisive role in estimating the onset date; distance and latitude and longitude were also important factors in the model. For about 70% of cases the onset date and the confirmation date differed by 2 to 7 days, with 3 days being the most common difference. When the random forest algorithm was used to estimate missing onset dates, validation showed that the best estimate reached an R² of 0.98, indicating that the estimated and actual values were largely consistent and that the model performed well. Conclusion The random forest model describes the factors influencing the onset date fairly comprehensively and is intuitive and convenient; it can be used to guide the completion of patient information and to correct the parameters of infectious disease transmission prediction models.
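The two accuracy metrics named in the abstract (RMSE and R²) and the onset-to-confirmation time difference can be illustrated with a short calculation. The following is a minimal sketch in plain Python, not the authors' code: the case records, the model estimates, and all function names are hypothetical and chosen only for illustration, and the random forest itself is not reproduced here.

```python
import math
from datetime import date


def rmse(actual, estimated):
    """Root mean square error between two equal-length sequences."""
    squared = [(a - e) ** 2 for a, e in zip(actual, estimated)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(actual, estimated):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - e) ** 2 for a, e in zip(actual, estimated))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def share_in_range(lags, low=2, high=7):
    """Fraction of cases whose lag (in days) lies in [low, high]."""
    return sum(low <= d <= high for d in lags) / len(lags)


# Hypothetical (onset date, confirmation date) pairs; NOT data from the paper.
cases = [
    (date(2020, 1, 20), date(2020, 1, 23)),
    (date(2020, 1, 25), date(2020, 1, 28)),
    (date(2020, 1, 28), date(2020, 2, 4)),
    (date(2020, 2, 1), date(2020, 2, 11)),
]

# Time difference between onset and confirmation, in days.
lags = [(confirmed - onset).days for onset, confirmed in cases]
share = share_in_range(lags)

# Encode onset dates as day offsets from an arbitrary reference date so that
# they can be compared numerically with model estimates.
reference = date(2020, 1, 1)
actual = [(onset - reference).days for onset, _ in cases]
estimated = [20, 24, 26, 31]  # hypothetical model output, same encoding

print("lags (days):", lags)
print("share with 2-7 day lag:", share)
print("RMSE (days):", round(rmse(actual, estimated), 3))
print("R^2:", round(r_squared(actual, estimated), 3))
```

Encoding dates as day offsets is one simple way to treat the onset date as a regression target; the abstract does not state how the authors encoded dates, so this choice is an assumption.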

Keywords: onset date; confirmation date; random forest; influencing factors

Classification: R181.2 [Medicine and Health: Epidemiology]

 
