Machine learning model for predicting cotton fiber quality in Xinjiang Region

Original title: 基于机器学习的新疆原棉品质预测模型研究


Authors: LU Yongdi; LI Peisong; GUO Yu; ZHANG Qipeng; LIU Taofen; WANG Tianhe; YANG Mingfeng [2]; XIANG Dao [2]; TIAN Jingshan [1]; ZHANG Wangfeng [1]

Affiliations: [1] Agricultural College/Key Laboratory of Oasis Ecological Agriculture of Xinjiang Production and Construction Corps, Shihezi University, Shihezi, Xinjiang 832003, China; [2] Ulanwusu Agrometeorological Experiment Station, Shihezi Meteorological Bureau, Shihezi, Xinjiang 832003, China

Source: Journal of Shihezi University (Natural Science), 2025, No. 1, pp. 55-67 (13 pages)

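Funding: National Natural Science Foundation of China (32060440); Xinjiang Production and Construction Corps Financial Science and Technology Program (2023AB080); Eighth Division Shihezi City Young and Middle-aged Science and Technology Innovation Leading Talent Program (2023RC03).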

Abstract: Raw cotton quality in Xinjiang shows significant regional differences, and climate factors such as temperature and light are closely related to it and strongly influence it. To clarify the relationship between meteorological factors and raw cotton quality, and to predict the trend and regional distribution of raw cotton quality on a scientific basis, this paper studies machine-learning-based prediction models of raw cotton quality in Xinjiang and seeks the optimal model linking raw cotton quality indicators to meteorological factors, which is of great significance for cotton production management and quality improvement in Xinjiang. Using notarized inspection data on raw cotton quality and meteorological data for the cotton-growing counties (cities) of Xinjiang from 2015 to 2022, the study analyzed the relative contribution of meteorological factors to fiber quality and the relationships between them. Two machine learning algorithms, Random Forest (RF) and Support Vector Regression (SVR), were used to build prediction models relating meteorological factors to fiber quality. The results show that multicollinearity exists between the meteorological feature variables of the cotton growth period and raw cotton quality. The random forest algorithm was used to calculate the variance explained in each raw cotton quality indicator by different combinations of meteorological feature variables; the combinations with higher variance explained were selected as model inputs, and the corresponding quality indicators were predicted from them. Compared with the SVR model, the random forest model accurately predicted raw cotton fiber length, breaking tenacity, micronaire and uniformity index, with prediction accuracy above 88.59% in every case and root mean square error (RMSE) between 0.0826 and 0.3192. Therefore, the random forest algorithm is better at randomly selecting the optimal training sample set and has a clear advantage in handling multicollinearity among the independent variables. Feature selection on the independent variables with the random forest algorithm markedly improves model accuracy, and the resulting model performs better for predicting raw cotton fiber length, breaking tenacity and uniformity index.
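The two evaluation quantities named in the abstract, RMSE and variance explained, can be sketched in plain Python. This is only an illustrative sketch: the abstract does not give the formulas the authors used, so variance explained is assumed here to be the common 1 - SS_res/SS_tot form, and the function names, feature names, and numbers are all hypothetical rather than taken from the paper.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def variance_explained(observed, predicted):
    """Share of variance explained, assumed here to be 1 - SS_res / SS_tot."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def best_feature_subset(observed, predictions_by_subset):
    """Return the feature combination whose predictions explain the most variance."""
    return max(
        predictions_by_subset,
        key=lambda subset: variance_explained(observed, predictions_by_subset[subset]),
    )

# Hypothetical fiber-length observations (mm) and predictions from models
# trained on two hypothetical combinations of meteorological features.
observed_length = [28.0, 29.0, 30.0, 31.0]
predictions_by_subset = {
    ("temperature",): [28.5, 28.5, 30.5, 30.5],
    ("temperature", "sunshine_hours"): [28.1, 29.1, 29.9, 30.9],
}

best = best_feature_subset(observed_length, predictions_by_subset)
print(best, rmse(observed_length, predictions_by_subset[best]))
```

In practice the predictions would come from a fitted regressor (for example a random forest evaluated on held-out or out-of-bag samples) rather than from hand-written lists; the selection step itself stays the same.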

Keywords: raw cotton quality; prediction model; temperature; feature selection

Classification No.: S562 [Agricultural Sciences: Crop Science]

 
