A Study on Model Performance for the Ethanol Precipitation Process of Lonicera japonica by NIR Based on Bagging-PLS and Boosting-PLS Algorithms  Cited by: 14


Authors: Chen Zhao[1], Wu Zhisheng[1], Shi Xinyuan[1], Xu Bing[1], Zhao Na[1], Qiao Yanjiang[1]

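Affiliation: [1] Research Center of Traditional Chinese Medicine Information Engineering, Beijing University of Chinese Medicine, Beijing 100102, China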

Source: Chinese Journal of Analytical Chemistry (《分析化学》), 2014, No. 11, pp. 1679-1686 (8 pages)

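Funding: Supported by the National Natural Science Foundation of China (No. 81303218) and the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20130013120006)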

Abstract: To provide a method for rapid evaluation of the ethanol precipitation process of Lonicera japonica, robust near-infrared spectroscopy (NIR) quantitative models were established. Based on NIR data for chlorogenic acid collected during the ethanol precipitation process, Bagging partial least squares (Bagging-PLS) and Boosting partial least squares (Boosting-PLS) models were built and compared with a partial least squares (PLS) model. On this basis, synergy interval partial least squares (siPLS) and competitive adaptive reweighted sampling (CARS) were each used to select spectral variables, and the predictive performance of the resulting models was examined. The results showed that Bagging-PLS and Boosting-PLS, with the number of latent factors set to 10, both predicted better than the PLS model. With siPLS, the bands selected for the first batch were 820-1029.5 nm and 1030-1239.5 nm, and those selected for the second batch were 820-959.5 nm and 960-1099.5 nm. With CARS, 5-fold and 10-fold cross-validation were used for the two batches, respectively, and the variable subset with the lowest root mean square error of cross-validation (RMSECV) was taken as the final selection. After variable selection, the root mean square error of prediction (RMSEP) of the Bagging-PLS and Boosting-PLS models for chlorogenic acid content in the two batches decreased by 0.02-0.04 g/L, and the correlation coefficient of prediction (rp) increased by 4%-5%. In summary, the Bagging-PLS and Boosting-PLS algorithms can serve as rapid prediction methods for NIR quantitative models of the ethanol precipitation process of Lonicera japonica.
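The comparison between a single PLS model and a bagged ensemble of PLS models can be illustrated with a minimal sketch. It is not the authors' implementation: the spectra are synthetic, the function names (`pls1_fit`, `bagging_pls_fit`, and so on) are hypothetical, the PLS step is a plain NIPALS PLS1, and the number of latent variables (3) and ensemble size (50) are assumptions chosen for the toy data rather than the 10 latent factors reported in the abstract. RMSEP is computed as the root mean square difference between reference and predicted values on a held-out set.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model with NIPALS; return (x_mean, y_mean, coef)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        norm = np.linalg.norm(w)
        if norm < 1e-12:           # nothing left to explain
            break
        w /= norm
        t = Xk @ w                 # scores
        tt = t @ t
        p = Xk.T @ t / tt          # X loadings
        c = (yk @ t) / tt          # y loading
        Xk = Xk - np.outer(t, p)   # deflate X
        yk = yk - c * t            # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)   # B = W (P'W)^-1 q
    return x_mean, y_mean, coef

def pls1_predict(model, X):
    x_mean, y_mean, coef = model
    return y_mean + (X - x_mean) @ coef

def bagging_pls_fit(X, y, n_components, n_models=50, seed=0):
    """Fit one PLS model per bootstrap resample of the calibration set."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    models = []
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)     # sample with replacement
        models.append(pls1_fit(X[idx], y[idx], n_components))
    return models

def bagging_pls_predict(models, X):
    """Average the predictions of all bootstrap models."""
    return np.mean([pls1_predict(m, X) for m in models], axis=0)

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Synthetic "spectra": one analyte band, one interferent band, plus noise.
rng = np.random.default_rng(42)
n_samples, n_wavelengths = 90, 150
wavelengths = np.linspace(820.0, 1239.5, n_wavelengths)
conc = rng.uniform(0.5, 3.0, n_samples)          # analyte level (toy values, g/L)
interferent = rng.uniform(0.0, 1.0, n_samples)
peak_a = np.exp(-((wavelengths - 950.0) / 40.0) ** 2)
peak_b = np.exp(-((wavelengths - 1100.0) / 60.0) ** 2)
X = np.outer(conc, peak_a) + np.outer(interferent, peak_b)
X += rng.normal(0.0, 0.02, X.shape)
y = conc + rng.normal(0.0, 0.02, n_samples)      # noisy reference values

X_cal, y_cal = X[:60], y[:60]                    # calibration set
X_val, y_val = X[60:], y[60:]                    # prediction (validation) set

n_lv = 3  # assumption for the toy data; the abstract uses 10 for the real data
pls_model = pls1_fit(X_cal, y_cal, n_lv)
bag_models = bagging_pls_fit(X_cal, y_cal, n_lv)

rmsep_pls = rmse(y_val, pls1_predict(pls_model, X_val))
rmsep_bag = rmse(y_val, bagging_pls_predict(bag_models, X_val))
print(f"RMSEP  PLS: {rmsep_pls:.4f}  Bagging-PLS: {rmsep_bag:.4f}")
```

On clean synthetic data like this the two RMSEP values may be very close; the benefit of bagging reported in the abstract concerns real process spectra, where individual calibration models are less stable.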

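Keywords: process analytical technology; Lonicera japonica; ethanol precipitation; Bagging partial least squares algorithm; Boosting partial least squares algorithm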

Classification No.: O657.33 [Science: Analytical Chemistry]

 
