Simulation of High-resolution Population Spatial Distribution Based on Ensemble Learning: A Case Study of Zhejiang Province


Authors: Xintong WU (吴心彤); Dawei GAO (高大伟) [2]; Feixiang LI (李飞翔); Chenming YAO (姚晨明); Naizhuo ZHAO (赵乃卓); Xuchao YANG (杨续超) [1]

Affiliations: [1] Ocean College, Zhejiang University, Zhoushan 316021, Zhejiang, China; [2] Zhejiang Climate Center, Hangzhou 310052, Zhejiang, China; [3] Department of Land Resources Management, School of Humanities and Law, Northeastern University, Shenyang 110169, Liaoning, China

Source: Remote Sensing Technology and Application (《遥感技术与应用》), 2024, No. 4, pp. 1013-1025 (13 pages)

Funding: National Natural Science Foundation of China (41971019).

Abstract: High-resolution gridded population data have become indispensable baseline data for major decision-making. This study explores a methodological framework for population spatialization that uses ensemble learning to fuse social sensing big data with multi-source remote sensing data. Taking Zhejiang Province as the study area, population predictor variables were extracted from Tencent location big data, points of interest (POI), and remote sensing data. Three individual machine learning algorithms (neural network, XGBoost, and random forest) and the Stacking ensemble learning method were used to build four population spatialization models, whose simulation accuracy was compared, and the 2020 census population of Zhejiang Province was disaggregated to a 100 m resolution grid. The results show that: (1) among the three individual algorithms, random forest has the highest simulation accuracy; compared with the individual algorithms, the Stacking ensemble strategy generalizes well, effectively alleviates the high-value overflow seen in single models, and reduces simulation error; (2) in the ensemble learning population grid, high population values in Zhejiang Province are concentrated in urban core areas, with a peak of about 500 people per grid cell, and population density decreases stepwise with increasing distance from the urban center; (3) compared with the WorldPop population dataset, the ensemble learning dataset shows clear advantages in simulating urban-center population and in data completeness. This study provides new methods and technical means for rapid and accurate population mapping in the era of big data.
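The abstract describes disaggregating census population to 100 m grid cells but does not state the redistribution formula. The sketch below illustrates one common approach, assumed here rather than taken from the paper: model predictions act as non-negative weights, and each administrative unit's census total is split across its cells in proportion to those weights, so cell values sum back to the census total. The function name `disaggregate`, the unit labels, and all numbers are hypothetical.

```python
from collections import defaultdict


def disaggregate(census_totals, cell_units, cell_weights):
    """Split each unit's census total across its grid cells by weight.

    census_totals: dict mapping unit id -> census population of that unit
    cell_units:    list giving the unit id of every grid cell
    cell_weights:  list of model-predicted weights, one per grid cell
                   (assumed to come from a trained model; not computed here)
    """
    # Accumulate the total weight and the number of cells in each unit.
    weight_sums = defaultdict(float)
    cell_counts = defaultdict(int)
    for unit, weight in zip(cell_units, cell_weights):
        weight_sums[unit] += max(weight, 0.0)  # negative predictions count as zero
        cell_counts[unit] += 1

    populations = []
    for unit, weight in zip(cell_units, cell_weights):
        if weight_sums[unit] > 0:
            share = max(weight, 0.0) / weight_sums[unit]
        else:
            # Assumed fallback: split evenly when every weight in the unit is zero.
            share = 1.0 / cell_counts[unit]
        populations.append(census_totals[unit] * share)
    return populations


# Hypothetical toy data: two administrative units with five grid cells in total.
census_totals = {"A": 1000.0, "B": 300.0}
cell_units = ["A", "A", "A", "B", "B"]
cell_weights = [5.0, 3.0, 2.0, 0.0, 0.0]

grid_pop = disaggregate(census_totals, cell_units, cell_weights)
print(grid_pop)
```

Because the weights are normalized within each unit, errors in the model's absolute predictions do not change the unit totals; only the relative pattern among cells matters. In a stacking setup, the weights would presumably be the meta-learner's output.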

Keywords: population data spatialization; ensemble learning; neural network model; XGBoost model; random forest model

Classification code: TP79 [Automation and Computer Technology: Detection Technology and Automatic Devices]

 
