Hybrid feature selection for cropland identification using GF-5 satellite image  (cited by: 4)


Authors: CHEN Zhulin; JIA Kun; LI Qiangzi [3]; XIAO Chenchao; WEI Dandan; ZHAO Xiang; WEI Xiangqin; YAO Yunjun [1,2]; LI Juan

Affiliations: [1] State Key Laboratory of Remote Sensing Science, Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China; [2] Beijing Engineering Research Center for Global Land Remote Sensing Products, Beijing Normal University, Beijing 100875, China; [3] Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100101, China; [4] Land Satellite Remote Sensing Application Center, Ministry of Natural Resources of the People's Republic of China, Beijing 100048, China

Source: National Remote Sensing Bulletin (《遥感学报》), 2022, No. 7, pp. 1383-1394 (12 pages)

Funding: National Key Research and Development Program of China (Nos. 2019YFE0127300, 2016YFB0501404); National Natural Science Foundation of China (No. 42171318).

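Abstract: Accurate cropland identification is the basis for crop yield estimation and food security assessment. Remote sensing data, an important data source for cropland identification, can provide dynamic and rapid monitoring results. Hyperspectral data have great potential for cropland classification, but their redundant bands reduce classification efficiency and accuracy. This study therefore proposes a hybrid feature selection algorithm suited to cropland classification with hyperspectral data. First, the dimensionality is reduced step by step, at a fixed step length, according to the importance ranking of the variables or their degree of constraint. Second, the turning point at which classification accuracy drops sharply is located, and the variables corresponding to it are taken as the feature subset. Finally, Sequential Backward Selection (SBS) is used to search for the optimal classification feature subset. Using GF-5 hyperspectral data, the study evaluates combinations of three dimensionality reduction methods (Random Forest (RF), Mutual Information (MI), and L1 regularization) and three classification algorithms (RF, Support Vector Machine (SVM), and K-Nearest Neighbor (KNN)) for cropland classification. The results show that the feature subset obtained with L1 regularization has low autocorrelation, and that the red-edge and near-infrared bands it contains effectively improve the separability of cropland, forest, and bare soil. Among the classifiers, SVM shows very good noise resistance in high-dimensional space, with higher classification accuracy than RF and KNN, whereas RF generalizes better than SVM and KNN in low-dimensional space. Compared with the feature subsets obtained in the first dimensionality reduction step, the optimal subsets found by SBS improved classification accuracy in every case. The L1-SVM-SBS model with 23 input dimensions achieved the highest overall accuracy (94.64%) and cropland recall (95.83%). The study offers a new approach to feature selection for hyperspectral data, selects more representative bands, improves cropland classification accuracy, and provides a reference for hyperspectral remote sensing classification research.

Accurate farmland area identification is the basis of crop yield estimation and an important indicator in food security assessment. As an important data source for farmland identification, remote sensing data can provide dynamic and fast observation results for classification. GF-5, which is the only hyperspectral satellite in the China High-resolution Earth Observation System, has great research and application potential in farmland identification. However, the curse of dimensionality caused by the redundant bands in hyperspectral data seriously affects the calculation speed and classification accuracy of models. To solve this problem, this research proposes a hybrid feature selection algorithm for farmland identification. First, on the basis of the feature importance provided by the feature selection algorithm, the feature dimension is gradually reduced from 295 to 5 with a step length of 10, and the overall accuracy of the classification results corresponding to each feature dimension is recorded. Second, the turning point (the dimension below which the overall accuracy drops sharply) is determined from the recorded overall accuracy, and the corresponding variables are adopted as the feature subset. Lastly, the Sequential Backward Selection (SBS) method is used to search for the best subset. Three feature selection algorithms (i.e., Random Forest (RF), Mutual Information (MI), and L1 regularization (L1)) and three classification algorithms (RF, Support Vector Machine (SVM), and K-Nearest Neighbor (KNN)) are examined. Results indicate that the autocorrelations of the three subsets differ significantly. Most of the bands selected by the MI method are continuous and concentrated in the blue and shortwave infrared range. Therefore, the extremely high autocorrelation that exists in this subset has a negative effect on classification accuracy. By contrast, the correlation between bands in the RF and L1 feature subsets is relatively weak. However, the two feature sets still result in different classification accu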

Keywords: cropland identification; GF-5; feature selection; hyperspectral remote sensing; L1 regularization; sequential backward selection

Classification code: S127 [Agricultural Sciences: Basic Agricultural Sciences]
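The three-step procedure described in the abstract (stepwise reduction along an importance ranking, turning-point detection, then SBS) can be sketched in plain Python as below. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the `drop_tol` threshold used to decide what counts as a sharp drop, the SBS tie-breaking and stopping rule, and the synthetic `toy_accuracy` scorer are all hypothetical. In a real study, `evaluate` would train a classifier (for example SVM) on GF-5 bands and return its overall accuracy, and `ranked` would come from RF importance, MI, or L1 regularization.

```python
def stepwise_reduction(ranked, evaluate, step=10, min_features=5):
    """Step 1: keep the top-n ranked features, shrinking n by `step` each time."""
    curve = []
    n = len(ranked)
    while n >= min_features:
        curve.append((n, evaluate(ranked[:n])))
        n -= step
    return curve  # list of (dimension, accuracy), from largest to smallest n


def find_turning_point(curve, drop_tol=0.02):
    """Step 2: return the last dimension before accuracy falls by more than
    drop_tol in a single step (drop_tol is an assumed threshold)."""
    for (n_hi, acc_hi), (_, acc_lo) in zip(curve, curve[1:]):
        if acc_hi - acc_lo > drop_tol:
            return n_hi
    return curve[-1][0]  # no sharp drop found: keep the smallest dimension


def sequential_backward_selection(features, evaluate, min_features=1):
    """Step 3: repeatedly drop the feature whose removal hurts accuracy least,
    remembering the best-scoring subset seen along the way."""
    current = list(features)
    best_subset, best_score = list(current), evaluate(current)
    while len(current) > min_features:
        candidates = [[f for f in current if f != drop] for drop in current]
        scores = [evaluate(c) for c in candidates]
        i = max(range(len(scores)), key=scores.__getitem__)
        current = candidates[i]
        if scores[i] >= best_score:  # ties favour the smaller subset
            best_subset, best_score = list(current), scores[i]
    return best_subset, best_score


# Synthetic stand-in for "train a classifier and return overall accuracy".
# Hypothetical setup: only the 20 top-ranked bands carry signal; every other
# band adds a small penalty, mimicking redundant bands.
USEFUL = set(range(20))


def toy_accuracy(subset):
    hits = len(USEFUL & set(subset))
    noise = len(subset) - hits
    return 0.50 + 0.02 * hits - 0.0001 * noise


ranked = list(range(295))  # band indices, most important first
curve = stepwise_reduction(ranked, toy_accuracy)  # 295, 285, ..., 5
turning = find_turning_point(curve)
best_subset, best_acc = sequential_backward_selection(ranked[:turning], toy_accuracy)
print(turning, len(best_subset), round(best_acc, 4))
```

Passing the scorer in as a callable keeps the three steps independent of any particular classifier or ranking method, so the same skeleton could in principle be reused for each of the nine ranking and classifier combinations the abstract compares.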

 
