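Affiliations: [1] College of Science, Guilin University of Technology, Guilin 541004 [2] Shanghai Youjiu Biotechnology Co., Ltd. (上海优久生物科技有限公司), Shanghai 201600
Source: Transactions of the Chinese Society for Agricultural Machinery (《农业机械学报》), 2015, No. 5, pp. 233-238 (6 pages)
Funding: National Natural Science Foundation of China (11226219; 61164020); Guangxi Natural Science Foundation (2014GXNSFBA118023)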
Abstract: Random forest (RF) regression was used to determine the protein content of fishmeal samples from near-infrared (NIR) spectra. Because RF models are inherently random, candidate models were optimized by tuning two key parameters: the number of decision trees (ntree) and the number of split variables (nsv). The decrease in the Gini coefficient (G) was taken as the indicator of each NIR wavelength variable's modeling importance and was used to select informative wavelengths for the NIR analysis of fishmeal protein, with the aim of improving the accuracy of the quantitative models. On statistical grounds, an equivalent optimal model with relatively low computational complexity was selected. The selected RF model builds 471 decision trees and randomly draws 103 wavelength variables for node splitting as the trees grow; 52 informative NIR wavelengths were selected according to the average decrease in G before and after node splitting. This equivalent optimal model gave a root mean square error (RMSEv) of 3.970% and a correlation coefficient (Rv) of 0.943 on the validation set. Evaluated on independent prediction samples that were excluded from the modeling process, it gave an RMSEp of 5.271% and an Rp of 0.906. The results indicate that RF regression combined with G-based wavelength selection can effectively improve the predictive ability of NIR spectroscopy for the quantitative determination of fishmeal protein.
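The abstract reports two kinds of figures (a root mean square error and a correlation coefficient) and a selection of the most important wavelengths by their mean decrease in G. The following is a minimal sketch of those calculations only, assuming the standard definitions of RMSE and the Pearson correlation coefficient. The function names `rmse`, `pearson_r`, and `top_k_wavelengths`, and the toy numbers in the usage note, are hypothetical and are not taken from the paper; the sketch does not train a random forest.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between reference and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    dev_t = math.sqrt(sum((t - mean_t) ** 2 for t in y_true))
    dev_p = math.sqrt(sum((p - mean_p) ** 2 for p in y_pred))
    return cov / (dev_t * dev_p)


def top_k_wavelengths(importance, k):
    """Return the k wavelengths with the largest importance score.

    `importance` is an assumed mapping from wavelength to its mean
    impurity decrease averaged over the trees of a fitted forest.
    The result is sorted by wavelength for readability.
    """
    ranked = sorted(importance, key=importance.get, reverse=True)
    return sorted(ranked[:k])
```

With toy values, `top_k_wavelengths({1000: 0.1, 1100: 0.5, 1200: 0.3}, 2)` keeps wavelengths 1100 and 1200. In the setting described by the abstract, k would be 52 and the scores would come from the fitted RF model.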