Variable Selection Method Based on Spatio-temporal Group Lasso and Hierarchical Bayesian Spatio-temporal Model

Cited by: 1

Authors: WANG Ling [1,2,3]; KANG Zihao

Affiliations: [1] School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing 100083, China; [2] Key Laboratory of Knowledge Automation for Industrial Processes of Ministry of Education, School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing 100083, China; [3] Shunde Innovation School, University of Science and Technology Beijing, Beijing 528399, China

Source: Journal of Geo-information Science, 2023, Issue 7, pp. 1312-1324 (13 pages)

Funding: National Natural Science Foundation of China (62076025, 61572073); Guangdong Basic and Applied Basic Research Foundation (2023A1515011320).

Abstract: Effectively selecting variables from high-dimensional, large-volume spatio-temporal data is one of the important problems in the spatio-temporal data field. Existing variable selection methods for spatio-temporal data do not fully account for spatio-temporal correlation during selection, carry out the variable selection stage separately from the prediction stage, and often require a manually set threshold on the number of spatio-temporal points to decide whether a variable is kept. As a result, they cannot accurately select the subset of variables with the greatest influence on the dependent variable, which degrades subsequent prediction. To address these shortcomings, this paper proposes a variable selection method based on spatio-temporal group Lasso and a hierarchical Bayesian spatio-temporal model, called the Hierarchical Bayesian Spatio-temporal Group Lasso Variable Selection method (HBST-GLVS). The method first uses spatio-temporal group Lasso for variable selection: it introduces a maximum time lag and a maximum spatial neighborhood to fully account for spatio-temporal correlation and, exploiting the continuity of spatio-temporal data, penalizes all spatio-temporal points of the same variable as a whole, which avoids the local one-sidedness caused by manually setting the number of spatio-temporal points. It then uses a hierarchical Bayesian spatio-temporal model to validate the effect of the variable selection, placing the variable selection process and the model validation process in the same framework for parameter tuning, so as to obtain the optimal variable subset. Experimental results show that, compared with existing methods, the RMSE (Root Mean Square Error) and MAE (Mean Absolute Error) of the proposed method on the Beijing air quality dataset and the Portland traffic flow dataset are reduced by 9.6%~25.7% and 6.6%~15.9%, respectively.

English abstract: Effectively selecting variables from high-dimensional, large-scale spatio-temporal data is one of the important issues in the field of spatio-temporal data analysis. Temporal and spatial correlation are the most important features of spatio-temporal data and must be considered to make variable selection effective. However, existing variable selection methods for spatio-temporal data do not fully consider spatio-temporal correlation, and the variable selection stage is separated from the prediction stage. Moreover, these methods often require a manually set threshold on the number of spatio-temporal points to decide which variables are selected, which may lead to inaccurate selection of the subset of variables that have the greatest impact on the dependent variable and result in poor prediction performance. In this paper, we propose a variable selection method based on the spatio-temporal group Lasso and the hierarchical Bayesian spatio-temporal model, called the Hierarchical Bayesian Spatio-temporal Group Lasso Variable Selection method (HBST-GLVS). In this method, the spatio-temporal expansion is carried out simultaneously in the variable selection stage and the prediction stage, and the best nearest-neighbor time domain and space domain are determined adaptively through cross-validation. To obtain the best prediction performance from the selected variables, the selection of spatio-temporal variables and the prediction of the spatio-temporal model are placed under the same framework, so that the selected variables and parameters correspond to the best prediction performance. To remove the need for a manually set threshold on the number of spatio-temporal points, variable selection is handled from the perspective of the entire sequence of each spatio-temporal variable, without any such threshold. Specifically, this method uses spatio-temporal group Lasso for variable selection, fully considers spatio-temporal correlation by introducing maximum time lag
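
The group-wise penalty described in the abstract can be illustrated with a small sketch. This is not the authors' HBST-GLVS implementation: it assumes a single location (so only the maximum time lag is expanded, with no spatial neighborhood), centered data without an intercept, a plain proximal-gradient solver, and synthetic data. The hierarchical Bayesian validation step is replaced here by simple in-sample RMSE and MAE. The names `lag_expand`, `group_lasso`, `max_lag`, and `lam` are hypothetical.

```python
import numpy as np

def lag_expand(X, max_lag):
    """Expand a (T, P) array into lagged columns, one group per variable.

    Columns p*max_lag .. (p+1)*max_lag-1 hold lags 1..max_lag of variable p.
    Row i of the result corresponds to target time t = max_lag + i.
    """
    T, P = X.shape
    cols = [X[max_lag - lag:T - lag, p]
            for p in range(P) for lag in range(1, max_lag + 1)]
    groups = np.repeat(np.arange(P), max_lag)
    return np.column_stack(cols), groups

def group_lasso(Z, y, groups, lam, n_iter=500):
    """Minimize (1/2n)||y - Z b||^2 + lam * sum_g sqrt(|g|) * ||b_g||_2
    by proximal gradient descent with group soft-thresholding."""
    n, d = Z.shape
    beta = np.zeros(d)
    step = n / np.linalg.norm(Z, 2) ** 2  # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        v = beta - step * (Z.T @ (Z @ beta - y) / n)
        for g in np.unique(groups):
            idx = groups == g
            norm = np.linalg.norm(v[idx])
            thresh = step * lam * np.sqrt(idx.sum())
            # A whole group (all lags of one variable) is kept or dropped together.
            v[idx] = 0.0 if norm <= thresh else (1.0 - thresh / norm) * v[idx]
        beta = v
    return beta

# Synthetic example (assumption): only variables 0 and 2 drive the response.
rng = np.random.default_rng(0)
T, P, max_lag = 300, 4, 3
X = rng.standard_normal((T, P))
Z, groups = lag_expand(X, max_lag)
t = np.arange(max_lag, T)
y = (1.0 * X[t - 1, 0] - 0.8 * X[t - 2, 0] + 0.6 * X[t - 1, 2]
     + 0.1 * rng.standard_normal(T - max_lag))

beta = group_lasso(Z, y, groups, lam=0.1)
selected = [int(g) for g in np.unique(groups)
            if np.linalg.norm(beta[groups == g]) > 0]

resid = y - Z @ beta
rmse = float(np.sqrt(np.mean(resid ** 2)))  # root mean square error
mae = float(np.mean(np.abs(resid)))         # mean absolute error
print("selected variables:", selected)
print("RMSE:", rmse, "MAE:", mae)
```

Because the penalty acts on the norm of each variable's full block of lagged coefficients, no per-point threshold is needed to decide whether a variable is kept. In a setting closer to the abstract, `max_lag` and `lam` would be tuned against the downstream model's prediction error instead of being fixed by hand.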

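Keywords: spatio-temporal data; variable selection; spatio-temporal correlation; spatio-temporal group Lasso; maximum time lag; maximum spatial neighborhood; hierarchical Bayesian spatio-temporal model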

Classification number: P208 [Astronomy and Earth Sciences - Cartography and Geographic Information Engineering]

 
