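Affiliations: [1] Technology Center, Shanghai Tobacco Group Co., Ltd., Shanghai 200082; [2] Department of Chemistry, Shanghai University, Shanghai 200444
Source: Spectroscopy and Spectral Analysis (《光谱学与光谱分析》), 2021, No. 2, pp. 473-477 (5 pages)
Funding: Supported by the Major Science and Technology Program of China National Tobacco Corporation (Zhongyanban [2016] 259) and the National Natural Science Foundation of China, Young Scientists Fund (21706156).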
Abstract: Discriminating the flavor type of flue-cured tobacco has long been a focus of the tobacco industry. In this work, 189 tobacco leaf samples of different flavor types were analyzed by mid-infrared (MIR) and near-infrared (NIR) spectroscopy. Absorbance values at 21 characteristic wavenumbers in the MIR spectra and at 13 characteristic wavenumbers in the NIR spectra were extracted as variables. Principal component analysis (PCA) was first applied to the selected MIR and NIR data separately for a qualitative analysis of three flavor styles: fen-flavor, medium-flavor and robust-flavor. In the PCA projections based on MIR data alone or on NIR data alone, the three flavor styles overlapped heavily and the class boundaries were unclear. The MIR and NIR data were then fused, and the absorbance values at all 34 characteristic wavenumbers were submitted to PCA together; the resulting projection separated the three flavor styles clearly. Next, a backward elimination (step-back) method and a genetic algorithm were used for variable selection on the 34 fused absorbance values, retaining 24 and 19 variables, respectively. A comparison of the PCA projections built from 34, 24 and 19 variables showed that the genetic algorithm, although it kept the fewest variables, still classified the samples accurately. The genetic algorithm was therefore used for variable selection on the fused MIR and NIR data, discarding the variables with little influence on flavor classification. Finally, a support vector classification (SVC) model was built to discriminate the fen-flavor, medium-flavor and robust-flavor styles using the variables selected by the genetic algorithm. The modeling accuracy was 92.72%, with 93.75%, 92.11% and 91.84% for the fen-flavor, medium-flavor and robust-flavor styles, respectively. The leave-one-out cross-validation (LOOCV) accuracy was 88.74%, with 90.63%, 86.84% and 87.76% for the three styles. The prediction accuracy on unknown samples was 86.84%, with 88.24%, 85.71% and 85.71% for the three styles. The modeling, LOOCV and prediction accuracies were thus all above 85%. These results indicate that fusing MIR and NIR data provides more characteristic information, which can be used to build a classification model for tobacco flavor styles and to support their rapid identification.
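The workflow described in the abstract (feature-level fusion of the MIR and NIR variables, a PCA projection, then an SVC model checked by leave-one-out cross-validation) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: it assumes scikit-learn, and the data are synthetic stand-ins, not the 189 tobacco samples. The sample counts, class offsets, kernel and hyperparameters are all assumptions, and the genetic-algorithm variable selection step is omitted.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in data (assumption): 3 flavor classes, 20 samples each.
n_per_class, n_mir, n_nir = 20, 21, 13
labels = np.repeat([0, 1, 2], n_per_class)

# Each block is unit-variance noise plus a class-dependent offset.
mir = rng.normal(size=(labels.size, n_mir)) + labels[:, None] * 1.0
nir = rng.normal(size=(labels.size, n_nir)) + labels[:, None] * 1.0

# Feature-level fusion: concatenate the two blocks column-wise (21 + 13 = 34).
fused = np.hstack([mir, nir])

# PCA projection of the standardized fused data onto two components,
# the kind of 2-D score plot used to inspect class separation.
scores = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(fused))

# SVC classifier evaluated by leave-one-out cross-validation.
# Scaling sits inside the pipeline so it is refit on each training fold.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
loo_accuracy = cross_val_score(model, fused, labels, cv=LeaveOneOut()).mean()

print("fused shape:", fused.shape)
print("PCA scores shape:", scores.shape)
print("LOOCV accuracy:", round(float(loo_accuracy), 4))
```

Keeping the scaler inside the pipeline means the held-out sample never influences the scaling statistics in each leave-one-out fold. The same caution would apply to a variable selection step: for an unbiased LOOCV estimate, it would also need to be refit inside each fold.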