Research on Cotton Yield Prediction in the Shihezi Cotton Region Based on Multiple Machine Learning Models (基于多种机器学习模型的石河子棉区棉花产量预测研究)


Authors: SUN Shuai (孙帅); LI Shun'ao (李顺澳); PENG Dongmei (彭冬梅); WANG Sen (王森); GUO Yanyun (郭燕云); WANG Xuejiao (王雪姣)

Affiliations: [1] Institute of Desert Meteorology, China Meteorological Administration, Urumqi 830002, Xinjiang, China; [2] Wulanwusu Ecology and Agrometeorology Observation and Research Station of Xinjiang, Shihezi 832000, Xinjiang, China; [3] Xinjiang Agro-meteorological Observatory / Information Centre of Xinjiang Xingnong-Net, Urumqi 830002, Xinjiang, China

Source: Desert and Oasis Meteorology (《沙漠与绿洲气象》), 2024, No. 6, pp. 166-172 (7 pages)

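Funding: Desert Meteorological Science Research Fund of China (Sqj2022002); Key Project of the Xinjiang Meteorological Science and Technology Innovation and Development Fund (ZD202105); Tianshan Talent Program of the Xinjiang Uygur Autonomous Region, Backbone Talent for Agriculture, Rural Areas and Farmers (2022SNGGCC013); Innovation Team for Simplified and Efficient Cotton Cultivation Technology (2023TSYCTD004).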

Abstract: Climate change increases the uncertainty of crop yield prediction, and conventional prediction models are limited when handling complex data and forecasting over long periods. This study takes the Shihezi cotton region of Xinjiang as its study area and uses meteorological observations, cotton planting area records, and per-unit-area yield statistics from 1983 to 2021. Four machine learning models were applied: K-Nearest Neighbor (KNN), Gradient Boosting Machine (GBM), Random Forest (RF), and Back Propagation Neural Network (BPNN; described in the Chinese abstract as a single-layer neural network). Model performance was assessed with the root mean square error (RMSE), normalized root mean square error (NRMSE), model efficiency index (EF), and coefficient of determination (R²). The objective was to build a yield prediction model that can comprehensively evaluate the interacting effects of multiple factors, and thereby offer a new approach to crop yield prediction. The results show that, among the individual models, RF, GBM, and BPNN learned best, while KNN had comparatively weak predictive power. The ensemble model combining RF, GBM, and BPNN outperformed each individual model in both learning and prediction, with NRMSE of 6.38% and 7.91% and RMSE of 109.41 and 114.67 kg·hm⁻², respectively, and with EF and R² both above 0.95. This indicates that the ensemble model offers a new and effective method for predicting cotton yield in the Shihezi cotton region. However, more rational model fusion strategies still need to be explored, and the algorithms need continued refinement, to further improve the accuracy and stability of the predictions. Future wo
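The four evaluation metrics named in the abstract can be sketched as follows. The abstract does not give their formulas, so the definitions here are assumptions: NRMSE is taken as RMSE divided by the observed mean (in percent), EF as the Nash-Sutcliffe efficiency, and R² as the squared Pearson correlation. The `average_ensemble` helper is a hypothetical equal-weight fusion rule used only for illustration; the abstract does not state how the authors combined their models, and the yield values below are made up.

```python
import math


def rmse(obs, pred):
    """Root mean square error, in the same units as the observations."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def nrmse(obs, pred):
    """RMSE as a percentage of the observed mean (assumed normalization)."""
    return 100.0 * rmse(obs, pred) / (sum(obs) / len(obs))


def ef(obs, pred):
    """Model efficiency, assumed here to be the Nash-Sutcliffe form."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def r2(obs, pred):
    """Coefficient of determination, assumed to be the squared Pearson r."""
    mean_obs = sum(obs) / len(obs)
    mean_pred = sum(pred) / len(pred)
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(obs, pred))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_pred = sum((p - mean_pred) ** 2 for p in pred)
    return cov ** 2 / (var_obs * var_pred)


def average_ensemble(model_predictions):
    """Equal-weight mean of several models' predictions (hypothetical rule)."""
    return [sum(vals) / len(vals) for vals in zip(*model_predictions)]


# Made-up yields in kg/hm^2, for illustration only; not data from the study.
observed = [1500.0, 1600.0, 1700.0, 1800.0]
model_a = [1450.0, 1650.0, 1680.0, 1850.0]
model_b = [1550.0, 1590.0, 1740.0, 1790.0]

ensemble = average_ensemble([model_a, model_b])
for name, fn in [("RMSE", rmse), ("NRMSE %", nrmse), ("EF", ef), ("R2", r2)]:
    print(f"{name}: {fn(observed, ensemble):.3f}")
```

In this toy case the errors of the two hypothetical models partly cancel when averaged, which is one reason an ensemble can score better than its members. Whether that holds for real data depends on how correlated the individual models' errors are.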

Keywords: cotton; machine learning; yield prediction; climate change

Classification code: S165.27 [Agricultural Science: Agricultural Meteorology]

 
