A Comparative Study of COD Hyperspectral Inversion Models in Water Based on Machine Learning

Cited by: 6

Authors: WANG Chun-ling (王春玲) [1,2]; SHI Kai-yuan (史锴源); MING Xing (明星); CONG Mao-qin (丛茂勤); LIU Xin-yue (刘昕悦); GUO Wen-ji (郭文记)

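Affiliations: [1] School of Information Science and Technology, Beijing Forestry University, Beijing 100083, China; [2] Engineering Research Center for Forestry-oriented Intelligent Information Processing of National Forestry and Grassland Administration, Beijing 100083, China; [3] Nanjing Institute of Software Technology, Institute of Software, Chinese Academy of Sciences, Nanjing 210049, Jiangsu, China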

Source: Spectroscopy and Spectral Analysis (《光谱学与光谱分析》), 2022, No. 8, pp. 2353-2358 (6 pages)

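Funding: Supported by the Scientific Research Equipment Development Project of the Chinese Academy of Sciences (YJKYYQ20170044) and the National Natural Science Foundation of China (61772078).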

Abstract: Chemical oxygen demand (COD) is an important indicator of organic pollution in water, so rapid and accurate measurement of COD matters. Machine learning is increasingly applied to water-quality inversion and has produced many results, and hyperspectral remote sensing, with its high spectral and spatial resolution and many imaging channels, has great potential for COD inversion in water. This study processes raw hyperspectral data with several preprocessing methods and uses the data before and after processing to compare how different machine learning models and different preprocessing methods perform in inverting water COD. First, 1548 samples of COD with corresponding hyperspectral data (400~1000 nm) were collected in the field on the Baodai River in Yangzhou with a ZK-UVIR-I in-situ spectral water-quality online monitor. To reduce spectral noise and remove scattering effects, the raw spectra were preprocessed with Savitzky-Golay (SG) smoothing, multiplicative scatter correction (MSC), and SG smoothing combined with MSC. Second, the sample set was randomly split into a training set (80%) and a test set (20%). COD hyperspectral inversion models were built on the full-band spectra of the preprocessed training set with four machine learning methods: linear regression, random forest, AdaBoost, and XGBoost. Three metrics, the coefficient of determination (R²), root mean square error (RMSE), and relative analysis error (RPD), were used to evaluate model accuracy on the test set. The results show that random forest, AdaBoost, and XGBoost all outperform linear regression, and that XGBoost gives the best predictive ability whether or not the spectra are preprocessed. The most accurate model is XGBoost applied to spectra processed with SG smoothing and MSC, with R² of 0.92, RMSE of 7.1 mg·L⁻¹, and RPD of 3.4. Because the original spectra may contain redundancy, principal component analysis (PCA) was used to reduce the dimensionality of the SG- and MSC-processed spectra, and the first ten principal components, whose cumulative contribution rate reaches 95%, were taken as model inputs. An XGBoost inversion model built on these inputs not only gained accuracy, with RPD reaching 3.8, but also cut training time from 72 s to 2.9 s. This work is relevant to building hyperspectral water-quality inversion models for this and similar waters …
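The preprocessing and evaluation steps named in the abstract can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the authors' implementation: the function names, the SG window length (11) and polynomial order (2), the synthetic spectra, and the RPD definition (sample standard deviation of the reference values divided by RMSE, a common chemometrics convention) are all assumptions, since the abstract does not state them. Model fitting (for example with XGBoost) is left out.

```python
import numpy as np


def sg_smooth(spectra, window=11, polyorder=2):
    """Savitzky-Golay smoothing of each row (one spectrum per row).

    window (odd) and polyorder are assumed values, not taken from the paper.
    """
    half = window // 2
    offsets = np.arange(-half, half + 1)
    # Least-squares polynomial fit over the window: row 0 of the pseudo-inverse
    # holds the weights that evaluate the fitted polynomial at the window centre.
    design = np.vander(offsets, polyorder + 1, increasing=True)
    weights = np.linalg.pinv(design)[0]
    # Repeat the edge values so the output keeps the input length.
    padded = np.pad(spectra, ((0, 0), (half, half)), mode="edge")
    return np.array([np.convolve(row, weights[::-1], mode="valid") for row in padded])


def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference spectrum.

    Each spectrum is regressed on the reference (row = intercept + slope * ref)
    and then corrected as (row - intercept) / slope.
    """
    ref = spectra.mean(axis=0) if reference is None else reference
    corrected = np.empty_like(spectra, dtype=float)
    for i, row in enumerate(spectra):
        slope, intercept = np.polyfit(ref, row, 1)
        corrected[i] = (row - intercept) / slope
    return corrected, ref


def pca_scores(X, var_threshold=0.95):
    """Project X onto the fewest principal components reaching var_threshold."""
    centered = X - X.mean(axis=0)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    k = int(np.searchsorted(np.cumsum(explained), var_threshold) + 1)
    return centered @ vt[:k].T, explained[:k]


def regression_metrics(y_true, y_pred):
    """Return (R^2, RMSE, RPD); RPD = SD(y_true) / RMSE is an assumed definition."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual = y_true - y_pred
    rmse = np.sqrt(np.mean(residual ** 2))
    r2 = 1.0 - np.sum(residual ** 2) / np.sum((y_true - y_true.mean()) ** 2)
    rpd = np.std(y_true, ddof=1) / rmse
    return r2, rmse, rpd


# Synthetic spectra (NOT the Baodai River data): one smooth band with random
# per-sample gain and offset, which is the distortion MSC is meant to remove.
rng = np.random.default_rng(0)
wavelengths = np.linspace(400, 1000, 301)
base = np.exp(-(((wavelengths - 650) / 120) ** 2))
gain = rng.uniform(0.8, 1.2, size=(50, 1))
offset = rng.uniform(-0.05, 0.05, size=(50, 1))
raw = offset + gain * base + rng.normal(0, 0.01, size=(50, 301))

processed, train_reference = msc(sg_smooth(raw))
scores, explained = pca_scores(processed)
print(processed.shape, scores.shape)
```

In a train/test setting such as the 80/20 split described above, the MSC reference spectrum and the PCA projection would normally be estimated on the training set only and then reused for the test set; `msc` accepts a `reference` argument for that purpose, while `pca_scores` as written would need to return its mean and components to be reused.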

Keywords: chemical oxygen demand; machine learning; hyperspectral; inversion; model comparison

Classification number: O433.4 [Mechanical Engineering - Optical Engineering]
