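Affiliations: [1] Guoneng Changzhou Power Generation Co., Ltd., Changzhou, Jiangsu 213001 [2] State Key Laboratory of Power System Operation and Control, Department of Energy and Power Engineering, Tsinghua University, International Joint Laboratory on Low Carbon Clean Energy Innovation (Ministry of Education), Beijing 100084 [3] Shanxi Research Institute for Clean Energy, Tsinghua University, Taiyuan, Shanxi 030032 [4] Nanjing Guodian Environmental Protection Technology Co., Ltd., Nanjing, Jiangsu 210061
Source: Spectroscopy and Spectral Analysis (《光谱学与光谱分析》), 2024, No. 7, pp. 1940-1945 (6 pages)
Funding: Supported by the National Natural Science Foundation of China (51906124).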
Abstract: Laser-induced breakdown spectroscopy (LIBS) is an emerging atomic spectroscopy technique. It requires no complex sample preparation and offers rapid, in situ, simultaneous multi-element measurement, which makes it promising for coal analysis. In recent years, chemometric and machine learning models have been widely used to improve the quantitative accuracy of LIBS in coal analysis. These models generally rely on a certain number of training samples to ensure the accuracy and reliability of their predictions. However, obtaining the true composition (label information) of coal samples requires conventional chemical analysis, which is complex and time-consuming. As a result, training samples are often too few and model performance suffers. To address the small-sample problem in LIBS-based coal analysis, this work proposes a semi-supervised learning method based on an ensemble of multiple models. Five baseline models are first built on the initial training set: multiple linear regression (MLR), partial least squares regression (PLSR), locally weighted partial least squares regression (LW-PLSR), support vector regression (SVR), and kernel extreme learning machine (K-ELM). The five models are applied to the unlabelled data, giving five predicted values for each unlabelled sample. For each unlabelled sample, the standard deviation of its five predicted values is calculated, and the sample with the smallest standard deviation is added to the training set, with the mean of the five predicted values as its pseudo-label. The training set is expanded iteratively in this way, and the models are updated and optimized as it grows. The final models are then used to analyse the test samples. The proposed method is tested on a LIBS coal dataset containing 20 training samples, 39 test samples, and 280 unlabelled samples. The results show that the proposed method improves the coefficient of determination (R²) for predicting fixed carbon, ash, and volatile matter content by 0.033, 0.102, and 0.118, respectively. When training samples are insufficient, semi-supervised learning can therefore effectively improve the accuracy and reliability of LIBS quantification models.
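The selection loop described in the abstract (predict with several models, pick the unlabelled sample on which they agree most, pseudo-label it with the mean prediction, retrain) can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the three regressors (`fit_mean`, `fit_knn`) are hypothetical stand-ins chosen so the example needs only the standard library, not the MLR, PLSR, LW-PLSR, SVR, and K-ELM models of the paper; the fixed iteration count `n_iter` is an assumed stopping rule; and the population standard deviation is used because the abstract does not say which form is applied (with a fixed number of models, either form ranks the samples identically).

```python
import statistics


def fit_mean(X, y):
    """Hypothetical stand-in model: always predicts the mean training label."""
    m = statistics.fmean(y)
    return lambda x: m


def fit_knn(k):
    """Hypothetical stand-in model: mean label of the k nearest training samples."""
    def fit(X, y):
        def predict(x):
            # Rank training samples by squared Euclidean distance to x.
            order = sorted(
                range(len(X)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(X[i], x)),
            )
            return statistics.fmean([y[i] for i in order[:k]])
        return predict
    return fit


def self_train(X_lab, y_lab, X_unlab, fits, n_iter):
    """Grow the training set by one pseudo-labelled sample per iteration.

    fits   : list of functions (X, y) -> predictor, one per ensemble member
    n_iter : assumed stopping rule (the abstract does not state one)
    """
    X_train, y_train = list(X_lab), list(y_lab)
    pool = list(X_unlab)
    for _ in range(min(n_iter, len(pool))):
        # Refit every ensemble member on the current (expanded) training set.
        models = [fit(X_train, y_train) for fit in fits]
        # One list of predictions (one per model) for each unlabelled sample.
        preds = [[m(x) for m in models] for x in pool]
        spreads = [statistics.pstdev(p) for p in preds]
        # The sample with the smallest disagreement among models is selected.
        best = min(range(len(pool)), key=spreads.__getitem__)
        X_train.append(pool.pop(best))
        # Its pseudo-label is the mean of the models' predictions.
        y_train.append(statistics.fmean(preds[best]))
    return X_train, y_train
```

In use, `X_lab` and `X_unlab` would hold spectral feature vectors and `y_lab` the measured content of one coal property (for example ash); the returned training set would then be used to fit the final models before predicting the test samples.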