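Authors: ZHOU Yu; ZHANG Guo-ping [2]; XUE Yi-fei; WANG Ru-liang; WAN Ran-ran; PANG Hua-ji (Jiangxi Meteorological Service Center, Nanchang, Jiangxi 330096, China)
Affiliations: [1] Jiangxi Meteorological Service Center, Nanchang, Jiangxi 330096; [2] Public Meteorological Service Center, China Meteorological Administration; [3] Jiangxi Provincial Basic Geographic Information Center; [4] Jiangxi Provincial Engineering Research Center of Surveying, Mapping and Geoinformation; [5] Qingdao Meteorological Bureau
Source: Modern Preventive Medicine (《现代预防医学》), 2020, No. 24, pp. 4422-4426 (5 pages)
Funding: National Natural Science Foundation of China (41871020); Innovation Fund of the Public Meteorological Service Center, China Meteorological Administration (M2020032).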
Abstract: Objective To identify the factors associated with the onset date of illness and to estimate missing onset dates, in order to complete the case database for novel coronavirus pneumonia and provide a reference for big-data analysis of the epidemic. Methods A time-series database was built from the newly confirmed 2019-nCoV cases published by health commissions at all levels in Jiangxi Province from January 22 to February 25, 2020, and the distributions of onset dates and confirmation dates were analyzed. A random forest algorithm was used to study how the onset date relates to the confirmation date, patient information (sex, age, travel history to Wuhan, etc.), the latitude and longitude of the patient's residence, and the distance between the residence and Nanchang. Estimation accuracy was evaluated with two indicators, root mean square error (RMSE) and the coefficient of determination (R2), and the influencing factors were ranked by importance using the mean decrease in accuracy. Results The confirmation date played a decisive role in estimating the onset date; distance and latitude/longitude were also important factors in the model. For about 70% of cases the onset date and the confirmation date differed by 2 to 7 days, most commonly 3 days. When the random forest algorithm was used to estimate missing onset dates, validation gave an R2 of 0.98 for the best estimate, indicating that estimated and actual values were essentially consistent and that the model performed well. Conclusion The random forest model describes the factors influencing the onset date fairly comprehensively and is intuitive and convenient to use. It can guide the completion of patient information and the correction of parameters in infectious disease transmission prediction models.
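
The two accuracy indicators named in the Methods, RMSE and R2, can be sketched as below. This is only an illustration of the standard definitions, not the authors' code: the function names are hypothetical, and the onset dates (written as day offsets) are made-up values rather than data from the study.

```python
import math


def rmse(actual, estimated):
    """Root mean square error between actual and estimated values."""
    if len(actual) != len(estimated) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - e) ** 2 for a, e in zip(actual, estimated)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(actual, estimated):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(estimated) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - e) ** 2 for a, e in zip(actual, estimated))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all actual values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical onset dates as day offsets from a reference day
# (illustrative values only, NOT taken from the study).
actual_onset = [20, 23, 25, 28, 30, 33]
estimated_onset = [21, 23, 24, 28, 31, 33]

print(f"RMSE = {rmse(actual_onset, estimated_onset):.3f} days")
print(f"R2   = {r_squared(actual_onset, estimated_onset):.3f}")
```

An R2 close to 1 together with a small RMSE (in days) would correspond to the kind of close agreement between estimated and actual onset dates that the abstract reports.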