Imputation Method to Predict Missing pH Data of Soil Attribute  Cited by: 3

Authors: ZHANG Yi-Fei (张逸飞); CAO Jia (曹佳) [1,2]

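Affiliations: [1] School of Information Science and Technology, Beijing Forestry University, Beijing 100083, China; [2] Engineering Research Center for Forestry-Oriented Intelligent Information Processing of National Forestry and Grassland Administration, Beijing 100083, China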

Source: Computer Systems & Applications (《计算机系统应用》), 2021, No. 1, pp. 277-281 (5 pages)

Funding: National Natural Science Foundation of China (61602042).

Abstract: Missing attribute data is a frequent problem in soil analysis, so imputation methods for missing soil attribute values need to be studied in order to improve the reliability of research results. This study takes a data-mining perspective and applies several missing-value handling methods, using the pH attribute of the Soil Nutrient Database of China's Major Farmland Ecosystems as the imputation target. Each method is evaluated on datasets with different missing rates in two respects: the goodness of fit between true and imputed values, and the imputation error. The results show that, compared with other methods such as multiple regression, support vector machine (SVM), and neural network, K-Nearest Neighbor (KNN) and random forest imputation with optimal parameters are effective and feasible for imputing soil pH. Averaged over the datasets with different missing rates, KNN and random forest achieve an MAE of 0.132 and 0.131, an RMSE of 0.174 and 0.178, and an R^2 of 0.775 and 0.765, respectively.
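The evaluation procedure described in the abstract (hide a fraction of known pH values, impute them, then compare imputed and true values by MAE, RMSE, and R^2) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the KNN variant here is a plain unweighted mean over Euclidean nearest neighbors, the attribute names `organic_matter` and `total_n` are hypothetical, and the data are synthetic rather than taken from the soil nutrient database.

```python
import math
import random


def knn_impute(rows, target, features, k=3):
    """Fill missing `target` values (None) with the mean of the k nearest complete rows."""
    complete = [r for r in rows if r[target] is not None]
    filled = []
    for r in rows:
        if r[target] is not None:
            filled.append(r[target])
            continue
        # Euclidean distance computed over the predictor attributes only.
        nearest = sorted(
            complete,
            key=lambda c: math.dist([r[f] for f in features],
                                    [c[f] for f in features]),
        )[:k]
        filled.append(sum(c[target] for c in nearest) / len(nearest))
    return filled


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination between true and imputed values."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def evaluate(rows, target, features, missing_rate, k=3, seed=0):
    """Hide a random share of known target values, impute them, and score the result."""
    rng = random.Random(seed)
    n_missing = max(1, round(missing_rate * len(rows)))
    hidden = sorted(rng.sample(range(len(rows)), n_missing))
    masked = [dict(r) for r in rows]
    for i in hidden:
        masked[i][target] = None
    imputed = knn_impute(masked, target, features, k)
    y_true = [rows[i][target] for i in hidden]
    y_pred = [imputed[i] for i in hidden]
    return mae(y_true, y_pred), rmse(y_true, y_pred), r2(y_true, y_pred)


# Synthetic stand-in data (assumption): pH loosely tied to two made-up predictors.
data_rng = random.Random(42)
toy = []
for _ in range(200):
    om = data_rng.uniform(5, 40)
    tn = data_rng.uniform(0.5, 3.0)
    toy.append({"organic_matter": om, "total_n": tn,
                "ph": 8.0 - 0.05 * om + 0.2 * tn + data_rng.gauss(0, 0.05)})

for rate in (0.1, 0.2, 0.3):
    m, r, s = evaluate(toy, "ph", ["organic_matter", "total_n"], rate)
    print(f"missing rate {rate:.0%}: MAE={m:.3f} RMSE={r:.3f} R^2={s:.3f}")
```

In practice the predictors would normally be standardized before computing distances, and k would be tuned, which is presumably what "optimal parameters" refers to in the abstract; both steps are left out here to keep the sketch short.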

Keywords: soil attribute data; pH; missing data; K-nearest neighbor; random forest

Classification number: S151.93 [Agricultural Sciences: Soil Science]

 
