At Risk Population Estimates for Belarus, Poland and Slovakia with Machine Learning

Viswadeep Lebakula, Clinton Stipek, Daniel Adams, Justin Epting, Marie Urban

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

High-resolution gridded population modeling is crucial for applications such as disaster response planning, infectious disease spread modeling, climate change impact estimation, and policy development. Multiple gridded population datasets have been developed, each tailored to specific objectives. Among them, the LandScan Global dataset is designed to represent ambient and unwarned population distributions. However, this dataset relies on a statistical approach that requires manual adjustments, making it time-consuming and labour-intensive. Existing machine learning (ML) methods often train and test at different spatial resolutions, potentially leading to inflated results, and they rely on census population totals for disaggregation. To address these limitations, we developed population estimates using two ML models, Random Forest (RF) and XGBoost, trained and tested at a consistent 30 arc-second resolution (≈1 square kilometer). The models were trained on 2020 data to predict 2021 populations for three countries: Belarus, Poland, and Slovakia. Our findings show that the performance of both RF (MAE from 5.75 to 13.25) and XGBoost (MAE from 8.15 to 23.44) is close to the LandScan Global estimates. Furthermore, neither model performed best across all grid cells: RF was more effective in areas with lower populations, while XGBoost excelled in more densely populated regions. The proposed approach can be used for countries where census data are not available.
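The abstract compares models by mean absolute error (MAE) and notes that neither model is best across all grid cells. As a minimal sketch of that kind of comparison, the snippet below computes MAE overall and separately for sparse and dense cells. This is not the authors' evaluation code. The function names, the threshold, and every number in it are hypothetical, chosen only to illustrate how one model can win in low-population cells and lose in high-population ones.

```python
from statistics import mean


def mae(observed, predicted):
    """Mean absolute error between two equal-length sequences."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return mean(abs(o - p) for o, p in zip(observed, predicted))


def mae_by_stratum(observed, predicted, threshold):
    """Split cells at a reference-population threshold; report MAE per stratum."""
    pairs = list(zip(observed, predicted))
    low = [(o, p) for o, p in pairs if o < threshold]
    high = [(o, p) for o, p in pairs if o >= threshold]
    return {
        "low": mae(*zip(*low)) if low else None,
        "high": mae(*zip(*high)) if high else None,
    }


# Made-up values for six grid cells (people per cell); NOT results from the paper.
reference = [0, 4, 10, 900, 2500, 6000]
model_a = [1, 5, 12, 700, 2900, 5200]   # hypothetical model, tighter in sparse cells
model_b = [6, 0, 20, 880, 2450, 6100]   # hypothetical model, tighter in dense cells
THRESHOLD = 100  # hypothetical cut between sparse and dense cells

strata_a = mae_by_stratum(reference, model_a, THRESHOLD)
strata_b = mae_by_stratum(reference, model_b, THRESHOLD)

print("overall MAE:", mae(reference, model_a), mae(reference, model_b))
print("model_a by stratum:", strata_a)
print("model_b by stratum:", strata_b)
```

With these illustrative numbers, a single overall MAE would hide the fact that the two models trade places between strata, which is why a per-stratum breakdown can be informative.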

Original language: English
Title of host publication: Proceedings - 2024 IEEE International Conference on Big Data, BigData 2024
Editors: Wei Ding, Chang-Tien Lu, Fusheng Wang, Liping Di, Kesheng Wu, Jun Huan, Raghu Nambiar, Jundong Li, Filip Ilievski, Ricardo Baeza-Yates, Xiaohua Hu
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 5804-5811
Number of pages: 8
ISBN (Electronic): 9798350362480
DOIs
State: Published - 2024
Event: 2024 IEEE International Conference on Big Data, BigData 2024 - Washington, United States
Duration: Dec 15 2024 → Dec 18 2024

Publication series

Name: Proceedings - 2024 IEEE International Conference on Big Data, BigData 2024

Conference

Conference: 2024 IEEE International Conference on Big Data, BigData 2024
Country/Territory: United States
City: Washington
Period: 12/15/24 → 12/18/24

Funding

This research is part of the LandScan program, which is funded by the United States Department of Defense.

Keywords

  • At risk populations
  • Belarus
  • Ensemble approach
  • High resolution population modeling
  • Poland
  • Slovakia
