Preprocessing Optimization Method for County-Scale Random Forest Digital Soil Attribute Mapping

Authors: 王凤仪, 赵东保[1], 刘湃, 肖炼

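Affiliations: [1] North China University of Water Resources and Electric Power (华北水利水电大学), Zhengzhou 450046; [2] Sichuan Basic Geographic Information Center, Ministry of Natural Resources (自然资源部四川基础地理信息中心), Chengdu 610041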

Source: Journal of Smart Agriculture (《智慧农业导刊》), 2025, Issue 5, pp. 41-45 (5 pages)

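Funding: National Natural Science Foundation of China (41971346); Key Research and Development Project of the Sichuan Science and Technology Program (2022YFN002).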

Abstract: Random forest is an important method for digital soil attribute mapping. Considering data imbalance and multicollinearity among environmental variables, this paper studies how to optimize the preprocessing stage of the random forest mapping method. The study takes the predictive mapping of pH at surface soil sample points in Dengzhou City, Henan Province, in 2007 as an example. To address the imbalanced distribution of the pH data, the SMOGN algorithm is applied so that the predicted pH range matches the actual distribution. To address multicollinearity among the environmental variables, the mapping accuracy obtained with the variance inflation factor, principal component analysis, and stepwise regression is compared, and a method for eliminating multicollinearity is given. Once data imbalance is accounted for and multicollinearity is eliminated, both the mean absolute error and the root mean square error over all sample points improve. The predicted soil pH covers a wider range, and extreme pH values can also be predicted. The proposed method effectively ensures that the distribution range of the predicted pH better matches the actual situation, and it improves the pH prediction accuracy of the random forest method.
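As an illustration of the multicollinearity step and the two accuracy metrics named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes NumPy is available, screens covariates by iteratively dropping the one with the largest variance inflation factor, and computes mean absolute error and root mean square error. The threshold of 10 is a common rule of thumb rather than a value taken from the paper, and the function names and the covariate names in the usage note are hypothetical.

```python
import numpy as np


def vif(X):
    """Variance inflation factor of each column of X (n_samples, n_features).

    Assumes no column is constant.
    """
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        y = X[:, j]
        # Regress column j on all other columns plus an intercept.
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        ss_res = float(resid @ resid)
        ss_tot = float(((y - y.mean()) ** 2).sum())
        r2 = 1.0 - ss_res / ss_tot
        out[j] = np.inf if r2 >= 1.0 else 1.0 / (1.0 - r2)
    return out


def drop_collinear(X, names, threshold=10.0):
    """Drop the covariate with the largest VIF until all VIFs are <= threshold.

    threshold=10 is an assumed rule of thumb, not a value from the paper.
    """
    X = np.asarray(X, dtype=float)
    names = list(names)
    while X.shape[1] > 1:
        v = vif(X)
        worst = int(np.argmax(v))
        if v[worst] <= threshold:
            break
        X = np.delete(X, worst, axis=1)
        del names[worst]
    return X, names


def mae(y_true, y_pred):
    """Mean absolute error."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(diff)))


def rmse(y_true, y_pred):
    """Root mean square error."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))
```

A call such as `drop_collinear(X, ["elevation", "slope", "elevation_smoothed"])` would return the reduced covariate matrix and the names that survived screening. The retained covariates could then be passed to any random forest regressor, and `mae` and `rmse` applied to held-out sample points.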

Keywords: digital soil attribute mapping; soil pH; random forest; data imbalance; multicollinearity

Classification code: S159-3 [Agricultural Science: Soil Science]

 
