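Affiliations: [1] School of Civil Engineering and Geomatics, Southwest Petroleum University, Chengdu 610500 [2] State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101 [3] School of Geography and Planning, Sun Yat-sen University, Guangzhou 510275 [4] Sichuan Academy of Safety Science and Technology, Chengdu 610046 [5] Sichuan Anxin Kechuang Co., Ltd., Chengdu 610041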
Source: Journal of Geo-information Science (《地球信息科学学报》), 2024, No. 3, pp. 620-637 (18 pages)
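Funding: Key R&D Program of the Sichuan Provincial Department of Science and Technology (2021YFQ0042); Science and Technology Program of the Tibet Autonomous Region (XZ201901-GA-07); National Key Research and Development Program of China (2020YFD1100701); Strategic Priority Research Program (Class A) of the Chinese Academy of Sciences (XDA20030302); Basic Scientific Research Funds for Research Institutes of Sichuan Province (2023JDKY0039-01).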
Abstract: Landslides occur frequently in the mountainous regions of western China, and accurate assessment of landslide susceptibility is essential for geohazard prevention and control. Integrated models that combine statistical methods with machine learning models are already widely used in landslide susceptibility assessment, but further optimization of their results still merits attention. This paper proposes a comprehensive assessment method that couples statistical methods, machine learning models, and clustering algorithms, and uses Ningnan County as a case study to examine how much it improves the accuracy of landslide susceptibility assessment. The method first combines the Information Value (IV), Certainty Factor (CF), and Frequency Ratio (FR) methods with the Random Forest (RF) model to obtain three integrated models (IV-RF, CF-RF, FR-RF). The ISO clustering algorithm is then introduced to classify the results of the three integrated models, yielding three coupled models (IV-RF-ISO, CF-RF-ISO, FR-RF-ISO). The AUC (Area Under the Curve) value, accuracy, F1 score, and Seed Cell Area Indexes (SCAI) are used to evaluate model accuracy. The results show that all integrated models outperform the single models, with accuracy and F1 score above 0.85 and AUC above 0.9. The FR-RF model performs best: its accuracy (0.911), F1 score (0.912), and AUC (0.965) are 0.095, 0.096, and 0.074 higher, respectively, than those of the FR model. Compared with the natural breaks method and Kmeans clustering, the coupled model FR-RF-ISO, which uses the ISO algorithm, gives the best classification, with a more pronounced difference between the SCAI values of the high- and low-susceptibility zones. These results indicate that a comprehensive assessment method coupling statistical methods, machine learning, and clustering algorithms achieves high accuracy and offers an approach for improving the accuracy of landslide susceptibility assessment.
English abstract: Landslides frequently occur in the mountainous areas of western China. Accurate mapping of landslide susceptibility is essential for geohazard management. Integrated models combining statistical methods and machine learning models have been widely applied to landslide susceptibility mapping. However, further optimization of their results is still worth investigating. This study proposes a comprehensive assessment method that couples statistical methods, machine learning models, and clustering algorithms. The effectiveness of the proposed method in improving the accuracy of landslide susceptibility mapping in Ningnan County is investigated. Firstly, the landslide influencing factors are selected from five aspects: geological environment, topography and geomorphology, meteorology and hydrology, vegetation and soil, and human engineering activities in the study area. Indicators are initially selected based on correlation analysis using the Pearson correlation coefficient method, and highly correlated factors are eliminated to establish the landslide susceptibility mapping index system. Next, the Information Value (IV), Certainty Factor (CF), and Frequency Ratio (FR) methods are each combined with the Random Forest (RF) model to obtain three integrated models (IV-RF, CF-RF, and FR-RF). Then, the ISO clustering, Natural Breaks clustering, and Kmeans clustering algorithms are introduced to classify the results of the three integrated models, obtaining nine coupled assessment models (IV-RF-ISO, CF-RF-ISO, FR-RF-ISO, IV-RF-NBC, CF-RF-NBC, FR-RF-NBC, IV-RF-Kmeans, CF-RF-Kmeans, and FR-RF-Kmeans). Lastly, the Area Under the Curve (AUC) value, accuracy, F1 score, and Seed Cell Area Indexes (SCAI) are used to evaluate the accuracy of the models. The results demonstrate that all the integrated models outperform the single models. The accuracy and F1 score of all integrated models exceed 0.85, and their AUC values exceed 0.9. The integrated models effectively address the misclassification of non-landslide samples, which is especially prominent in the single IV and CF models. A
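Keywords: landslide susceptibility; information value; certainty factor; frequency ratio; random forest; clustering algorithm; Ningnan County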
Classification number: P642.22 [Astronomy and Earth Sciences: Engineering Geology]
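Illustrative sketch: The abstract names three statistical indices (FR, IV, CF) without defining them. The Python sketch below shows how they are commonly computed for one class of an influencing factor (for example, one slope interval). It assumes the definitions widely used in the landslide susceptibility literature, which may differ in detail from the paper's own formulation. The function and parameter names are hypothetical and are not taken from the paper.

```python
import math


def frequency_ratio(class_landslide, class_cells, total_landslide, total_cells):
    """FR: share of landslide cells in the class divided by share of area in the class."""
    return (class_landslide / total_landslide) / (class_cells / total_cells)


def information_value(class_landslide, class_cells, total_landslide, total_cells):
    """IV: natural log of the frequency ratio (assumed definition).

    Undefined when the class contains no landslide cells (FR = 0).
    """
    return math.log(
        frequency_ratio(class_landslide, class_cells, total_landslide, total_cells)
    )


def certainty_factor(class_landslide, class_cells, total_landslide, total_cells):
    """CF in [-1, 1], comparing class landslide density with the study-area density."""
    ppa = class_landslide / class_cells   # conditional probability within the class
    pps = total_landslide / total_cells   # prior probability over the whole study area
    if ppa >= pps:
        return (ppa - pps) / (ppa * (1 - pps))
    return (ppa - pps) / (pps * (1 - ppa))
```

In an integrated model such as FR-RF, each factor class would typically be replaced by its index value before being passed to the random forest as an input feature. That workflow is an assumption based on common practice, since the abstract does not describe it.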