Local Preserving Projection Similarity Measure Method Based on Kernel Mapping and Rank-Order Distance  (Cited by: 3)


Authors: QIN Yu-hua (秦玉华) [1]; ZHANG Meng (张萌); YANG Ning (杨宁); SHAN Qiu-fu (单秋甫)

Affiliations: [1] College of Information Science and Technology, Qingdao University of Science and Technology, Qingdao 266061, Shandong, China; [2] Qingdao Lanzhi Modern Service Industry Digital Engineering Research Center, Qingdao 266071, Shandong, China; [3] Technical Research Center, China Tobacco Yunnan Industrial Co., Ltd., Kunming 650024, Yunnan, China

Source: Spectroscopy and Spectral Analysis (《光谱学与光谱分析》), 2021, No. 10, pp. 3117-3122 (6 pages)

Funding: Supported by the National Key R&D Program of China (2018YFB1701704) and a project of China Tobacco Yunnan Industrial Co., Ltd. (2019XX02).

Abstract: Near-infrared spectra are high-dimensional, highly redundant, nonlinear, and available only in small samples, which leads to the "curse of dimensionality" when measuring spectral similarity. To address this, a locality preserving projection algorithm based on kernel mapping and rank-order distance (KRLPP) is proposed. First, the spectral data are mapped to a higher-dimensional space through a kernel transformation, which preserves the nonlinear characteristics of the manifold structure. Then the data are reduced in dimension with a modified locality preserving projection (LPP) algorithm in which the rank-order distance replaces the traditional Euclidean or geodesic distance; by using the information of shared neighboring points, it yields more accurate local neighborhood relationships. Finally, spectral similarity is measured by computing distances in the low-dimensional space. The method addresses the "distance failure" problem of high-dimensional spaces and improves the accuracy of the similarity measurement. To verify KRLPP, the optimal parameters, namely the number of nearest neighbors k and the reduced dimensionality d, were first determined from the change in the information residual of the dataset before and after dimension reduction. Second, KRLPP was compared with PCA, LPP, and INLPP in terms of the projection effect of spectral dimension reduction and of model classification performance. The results show that KRLPP distinguishes tobacco leaf positions well, and that its dimension-reduction effect and its correct identification rate for the different positions are clearly better than those of PCA, LPP, and INLPP. Finally, five representative tobacco leaves were selected as targets from the leaf-blend formula of a cigarette brand. PCA, LPP, and KRLPP were each used to find similar leaves for every target among 300 tobacco samples used for formula maintenance, and the leaves and blend formulas before and after replacement were evaluated in terms of chemical composition and sensory evaluation. LPP and KRLPP used the same dimension-reduction parameters, and PCA used the first 6 principal components. The replacement leaves and formulas selected by KRLPP showed the smallest differences, compared with PCA and LPP, in chemical components such as total sugar, reducing sugar, total nicotine, and total nitrogen and in sensory indicators such as aroma, smoke, and taste, and gave the highest similarity-measurement accuracy. The method can be applied to finding replacement raw materials for blended products and can help manufacturers maintain product quality.
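
The three steps named in the abstract (kernel mapping, a rank-order neighborhood graph, and an LPP-style projection) can be illustrated with a minimal Python sketch. This is not the authors' implementation, and the abstract gives no formulas. Everything specific below is an assumption: the Gaussian kernel and its `gamma`, the commonly used symmetric rank-order distance (sum of cross ranks divided by the smaller mutual rank), binary graph weights, the regularization term `reg`, and the function names `rank_order_distance`, `krlpp_embed`, and `most_similar`.

```python
import numpy as np
from scipy.linalg import eigh
from scipy.spatial.distance import cdist


def rank_order_distance(dist):
    """Symmetric rank-order distance from a pairwise distance matrix (assumed form)."""
    n = dist.shape[0]
    d = dist.astype(float)          # astype returns a copy, the input is untouched
    np.fill_diagonal(d, -1.0)       # force every point to rank itself first
    order = np.argsort(d, axis=1, kind="stable")       # order[a, i]: i-th neighbour of a
    rank = np.empty_like(order)
    rank[np.arange(n)[:, None], order] = np.arange(n)  # rank[a, b]: position of b in a's list
    asym = np.zeros((n, n))
    for a in range(n):
        for b in range(n):
            if a != b:
                # Sum the ranks, in b's list, of the points a ranks up to and including b.
                asym[a, b] = rank[b, order[a, : rank[a, b] + 1]].sum()
    denom = np.minimum(rank, rank.T).astype(float)
    np.fill_diagonal(denom, 1.0)    # avoid 0/0 on the diagonal
    return (asym + asym.T) / denom


def krlpp_embed(X, k=5, d=2, gamma=1.0, reg=1e-6):
    """Kernel LPP-style embedding with a rank-order k-NN graph (illustrative only)."""
    n = X.shape[0]
    K = np.exp(-gamma * cdist(X, X, "sqeuclidean"))    # assumed Gaussian kernel
    diag = np.diag(K)
    # Distances in the kernel-induced feature space.
    feat = np.sqrt(np.maximum(diag[:, None] + diag[None, :] - 2.0 * K, 0.0))
    rod = rank_order_distance(feat)
    np.fill_diagonal(rod, np.inf)                      # exclude self from neighbours
    nn = np.argsort(rod, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nn.ravel()] = 1.0    # binary weights (assumption)
    W = np.maximum(W, W.T)                             # symmetrise the graph
    D = np.diag(W.sum(axis=1))
    L = D - W                                          # graph Laplacian
    A = K @ L @ K
    B = K @ D @ K + reg * np.eye(n)                    # regularised for positive definiteness
    _, vecs = eigh((A + A.T) / 2.0, (B + B.T) / 2.0)   # eigenvalues in ascending order
    return K @ vecs[:, :d]                             # coordinates of the n samples


def most_similar(Y, target, m=3):
    """Indices of the m samples closest to `target` in the embedded space."""
    dist = np.linalg.norm(Y - Y[target], axis=1)
    dist[target] = np.inf
    return np.argsort(dist)[:m]
```

Given a spectra matrix `X` with one sample per row, `Y = krlpp_embed(X, k=5, d=3, gamma=0.01)` followed by `most_similar(Y, 0)` returns candidate substitutes for sample 0. The values of `k`, `d`, and `gamma` here are placeholders; the paper selects k and d from the information residual, which this sketch does not reproduce.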

Keywords: near-infrared spectroscopy; locality preserving projection algorithm; kernel mapping; rank-order distance; similarity measure

Classification number: O657.33 [Science: Analytical Chemistry]

 
