Estimating forest stock volume based on feature selection and machine learning


Authors: Zhao Yabing, Peng Daoli[1], Guo Famiao, Wang Yin, Huang Jingxian

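Affiliation: [1] Key Laboratory of Forest Resources & Environmental Management of National Forestry and Grassland Administration, Beijing Forestry University, Beijing 100083, China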

Source: Journal of Beijing Forestry University, 2025, No. 4, pp. 155-167 (13 pages)

Funding: National Key Research and Development Program of China (2023YFD2200403).

Abstract: [Objective] Using multi-source remote sensing data, this study evaluated the accuracy of forest stock volume estimation models built from different combinations of feature selection methods and machine learning algorithms, and explored their complementary potential, with the aim of improving the accuracy of forest stock volume estimation.

[Method] Based on data from the 9th National Forest Resources Continuous Inventory in Hebei Province, northern China, the study combined four types of remote sensing data (GF-1, Sentinel-2, Sentinel-1 and ASTER GDEM), three feature selection methods (variable selection using random forests (VSURF), recursive feature elimination (RFE) and Boruta) and five machine learning algorithms (support vector regression (SVR), K-nearest neighbor (KNN), random forest (RF), categorical boosting (CatBoost) and extreme gradient boosting (XGBoost)) to construct forest stock volume models and identify the optimal one. In addition, analysis of variance (ANOVA) was used to quantify the effects of three factors (dataset, feature selection method and machine learning algorithm) on forest stock volume estimation.

[Result] (1) ANOVA showed that dataset, feature selection and machine learning algorithm all had a significant effect on the performance of forest stock volume estimation. (2) Combining multi-source remote sensing data effectively improved estimation performance: compared with other datasets, the models built on the combined GF-1, Sentinel-2, Sentinel-1 and ASTER GDEM data showed higher estimation accuracy. Overall, the Boruta feature selection method was superior to VSURF and RFE, and CatBoost outperformed the other algorithms (SVR, KNN, RF and XGBoost). (3) With the combination of GF-1, Sentinel-2, Sentinel-1 and ASTER GDEM, the estimation model built with Boruta feature selection and the CatBoost algorithm achieved the highest accuracy (R² = 0.6385, RMSE = 13.3053 m³/ha).

[Conclusion] When estimating the forest stock volume of Baoding City from multi-source remote sensing data, combining feature selection with machine learning algorithms can significantly improve model performance and yield more accurate stock volume estimates. The results not only improve current methods for estimating forest stock volume from multi-source remote sensing data, but also provide new ideas and a reference for large-area forest stock volume monitoring.
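
The workflow summarized in the abstract (select features, fit several regressors, compare R² and RMSE) can be illustrated with a minimal sketch. This is not the authors' code: it assumes scikit-learn only, uses RFE as the feature selector because Boruta and VSURF are not part of scikit-learn, substitutes `GradientBoostingRegressor` for CatBoost and XGBoost, and runs on synthetic data from `make_regression` rather than inventory plots. The function name `compare_models` and all hyperparameters are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.feature_selection import RFE
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR


def compare_models(X, y, n_features=8, seed=0):
    """Select features with RFE, then score several regressors on a held-out split."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed
    )

    # Fit the selector on the training split only, so the test split
    # does not influence which features are kept.
    selector = RFE(
        RandomForestRegressor(n_estimators=50, random_state=seed),
        n_features_to_select=n_features,
    )
    selector.fit(X_train, y_train)
    X_train_sel = selector.transform(X_train)
    X_test_sel = selector.transform(X_test)

    models = {
        # SVR and KNN are distance-based, so features are standardized first.
        "SVR": make_pipeline(StandardScaler(), SVR(C=100.0)),
        "KNN": make_pipeline(StandardScaler(), KNeighborsRegressor(n_neighbors=5)),
        "RF": RandomForestRegressor(n_estimators=100, random_state=seed),
        # Assumption: a scikit-learn stand-in for CatBoost / XGBoost.
        "GBR": GradientBoostingRegressor(random_state=seed),
    }

    results = {}
    for name, model in models.items():
        model.fit(X_train_sel, y_train)
        pred = model.predict(X_test_sel)
        rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
        results[name] = {"R2": float(r2_score(y_test, pred)), "RMSE": rmse}
    return selector.support_, results


# Synthetic stand-in for plot-level remote sensing features and stock volume.
X, y = make_regression(
    n_samples=200, n_features=20, n_informative=8, noise=10.0, random_state=0
)
selected, results = compare_models(X, y)
for name, scores in sorted(results.items(), key=lambda kv: kv[1]["RMSE"]):
    print(f"{name}: R2={scores['R2']:.3f}, RMSE={scores['RMSE']:.3f}")
```

Ranking by held-out RMSE mirrors how the abstract reports its best model. A fuller comparison along the lines described would repeat this for each dataset and each selector, then test the three factors with ANOVA.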

Keywords: forest stock volume; multi-source remote sensing data; feature selection; machine learning algorithm; ensemble learning

Classification code: S771.8 [Agricultural Sciences: Forest Engineering]

 
