Risk Assessment of Mountain Torrents Based on Three Machine Learning Algorithms  Cited by: 25


Authors: ZHOU Chao; FANG Xiuqin [1]; WU Xiaojun [1]; WANG Yuchen (School of Earth Sciences and Engineering, Hohai University, Nanjing 211100, China)

Affiliation: [1] School of Earth Sciences and Engineering, Hohai University

Source: Journal of Geo-information Science, 2019, No. 11, pp. 1679-1688 (10 pages)

Funding: National Key Research and Development Program of China (2016YFA0601500)

Abstract: Floods are regarded as the most frequent natural disaster in China, threatening human safety and causing severe economic losses. Jiangxi Province, which frequently suffers from mountain torrents, was chosen as the study area. Following the conceptual model of flood risk, 12 flood risk indexes were selected from three aspects: trigger factors, the hazard-inducing environment, and hazard-bearing bodies. Three flood risk assessment models were built with the k-Nearest Neighbor (kNN), Random Forest (RF), and AdaBoost machine learning algorithms. The models were evaluated with three quantitative performance indexes: accuracy, the Kappa coefficient, and the ROC curve (AUC value). Index importance was analyzed jointly with Random Forest and the Boruta feature extraction algorithm. Finally, the mountain torrent risk zoning maps of Jiangxi Province produced by the three models were compared, and the distribution of mountain torrent disasters was analyzed. The results show that: (1) the mean accuracy, Kappa coefficient, and AUC value of the AdaBoost model were 0.902, 0.870, and 0.826, respectively; its accuracy and Kappa coefficient were slightly higher than those of RF and its AUC value was comparable to that of RF, while all three performance indexes of the kNN model were lower than those of the other two algorithms; (2) five indexes play a very important role in forming the final flood risk: potential farmland productivity, mean annual maximum 6 h rainstorm, mean annual maximum 1 h rainstorm, the normalized difference vegetation index (NDVI), and mean annual rainfall; (3) the higher-risk and highest-risk zones together account for about 34.4% of the total area of Jiangxi Province and are mainly distributed in mountainous areas with high rainfall, heavy rainstorms, and high potential farmland productivity.
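The three performance indexes named in the abstract (accuracy, Kappa coefficient, AUC) can be illustrated with a minimal pure-Python sketch. This is not the authors' code: the function names and the toy labels and scores are hypothetical, and the AUC shown is the two-class form, whereas the record does not say how the paper handles multiple risk levels.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)  # observed agreement
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Agreement expected by chance from the two label distributions.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_o - p_e) / (1 - p_e)


def binary_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = flood-prone sample, 0 = not flood-prone.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]  # model risk scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]     # assumed 0.5 threshold

print(accuracy(y_true, y_pred))     # 0.75
print(cohen_kappa(y_true, y_pred))  # 0.5
print(binary_auc(y_true, scores))   # 0.9375
```

Accuracy and Kappa depend on the thresholded labels, while AUC depends only on how the scores rank the samples, which is one reason the three indexes can order models differently.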

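Keywords: Random Forest machine learning algorithm; AdaBoost machine learning algorithm; ROC curve; Boruta algorithm; flood risk assessment; Jiangxi Province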

Classification: TP3 [Automation and Computer Technology - Computer Science and Technology]

 
