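Authors: He Miao (何苗)[1], Cao Shuang (曹爽)[2], Wang Shuang (王双), Shen Tiefeng (沈铁峰)[4], Huang Desheng (黄德生)[5], Guan Peng (关鹏)[2]
Affiliations: [1] Information Center, The First Affiliated Hospital of China Medical University, 110001; [2] Department of Epidemiology, School of Public Health, China Medical University, 110001; [3] Shenhe District Health Bureau, Shenyang, Liaoning Province, 110061; [4] Huludao Municipal Center for Disease Control and Prevention, Liaoning Province, 125000; [5] Department of Mathematics, College of Basic Medical Sciences, China Medical University, 110001
Source: Chinese Journal of Health Statistics (《中国卫生统计》), 2014, No. 1, pp. 6-9 (4 pages)
Funding: National Natural Science Foundation of China (71073175)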
Abstract:
Objective: To explore the feasibility of predicting dysentery incidence with the support vector machine with recursive feature elimination algorithm (SVM-RFE).
Methods: Monthly dysentery incidence data for Huludao, Liaoning Province, from January 2004 to December 2011 were provided by the Huludao Municipal Center for Disease Control and Prevention, and meteorological data for the same period were retrieved from the China Bureau of Meteorology. Descriptive statistics were first used to characterize the seasonal pattern of dysentery, and Spearman rank correlation was used to examine the relationship between incidence and meteorological factors. The standardized values and principal component factors of the monthly meteorological variables then served as independent variables. The data were randomly divided into a training set (2/3) and a test set (1/3), with the number of cross-validation repetitions set to 100, and SVM-RFE with a radial basis kernel function was used to find the optimal subset of candidate variables, on which the predictions were based. All analyses were carried out in R 2.90.
Results: SVM-RFE ranked the 17 meteorological indicators by importance. The top five, in descending order, were average temperature, average maximum temperature, percentage of precipitation anomaly, average wind speed, and average minimum temperature. As more independent variables were included, the coefficient of determination (R²) of the training set rose from 0.702 to 0.945. On the test set, R² peaked at 0.653 with the top two variables, average temperature and average maximum temperature. Evaluated by cross-validation, the nu-regression SVM outperformed the traditional log-linear regression model on both sets.
Conclusion: SVM can fit the temporal trend of dysentery incidence well, and the RFE algorithm shows promise for variable selection.
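The Spearman rank correlation step named in the Methods can be illustrated with a minimal sketch. It is not the authors' code (they report using R), and the function names `average_ranks`, `pearson` and `spearman`, as well as the twelve monthly values, are hypothetical and invented purely for illustration. The sketch ranks both series, giving tied values their average rank, and then computes the Pearson correlation of the ranks, which is the standard definition of Spearman's coefficient.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical monthly values, invented for illustration only;
# they are NOT the Huludao data described in the abstract.
mean_temperature = [-5.1, -1.8, 4.6, 12.3, 18.9, 23.0, 25.2, 24.6, 19.4, 11.7, 3.2, -3.0]
case_counts = [3, 2, 4, 9, 15, 28, 41, 37, 22, 10, 5, 3]

rho = spearman(mean_temperature, case_counts)
print(f"Spearman rho = {rho:.3f}")
```

Because only ranks enter the calculation, the coefficient is unchanged by any monotonic rescaling of either series, which is one reason it suits skewed monthly case counts.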