Abstract
High-resolution population datasets have been leveraged across a broad range of domains, including climate change, public policy, humanitarian aid, and rescue operations. Machine learning methods have been adopted to generate high-resolution, or gridded, population estimates from geospatial input features such as buildings, roads, and nighttime lights. In this study, we evaluate the importance of population features using Random Forest models at three levels of analysis (global, country, and administrative unit), using permutation-based importance measures. Our research addresses key questions to improve our understanding of high-resolution population modeling: Are certain features globally (10 countries collectively) more important than others? Do optimal features vary by country? Within each country, does feature importance differ across administrative units? What similarities exist in feature importance at the global, country, and administrative unit levels? To answer these questions, we use the Kneedle algorithm to automate the selection of optimal features. We find that features display patterns across spatial boundaries, as evidenced by the same feature being the most important indicator of population in 7 of the 10 countries modeled. Our findings indicate that while important features may vary across geographies, certain features are consistently more important than others regardless of geography.
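The knee-based selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes permutation importance scores have already been computed elsewhere, it uses made-up scores with placeholder feature names, and it applies a simplified Kneedle-style rule (normalize the ranked curve, then take the point farthest below the line joining its endpoints) without the smoothing and sensitivity steps of the full algorithm. The function names `find_knee` and `select_features` are hypothetical.

```python
def find_knee(scores):
    """Return the index of the knee in a descending score curve.

    Simplified Kneedle-style rule (an assumption, not the full algorithm):
    normalize both axes to [0, 1] and pick the point that lies farthest
    below the straight line joining the first and last points.
    """
    n = len(scores)
    if n < 3:
        return n - 1  # too few points to have a knee; keep everything
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return n - 1  # flat curve: no knee, keep everything
    best_idx, best_gap = 0, float("-inf")
    for i, score in enumerate(scores):
        x_n = i / (n - 1)                 # normalized rank
        y_n = (score - lo) / (hi - lo)    # normalized importance
        gap = (1.0 - x_n) - y_n           # distance below the chord
        if gap > best_gap:
            best_idx, best_gap = i, gap
    return best_idx


def select_features(importances):
    """Rank features by importance and keep those up to the knee."""
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    knee = find_knee([score for _, score in ranked])
    # Design choice (assumption): the knee point itself is kept.
    return [name for name, _ in ranked[: knee + 1]]


# Illustrative, made-up importance scores with placeholder feature names.
example_importances = {
    "feature_1": 0.50,
    "feature_2": 0.30,
    "feature_3": 0.08,
    "feature_4": 0.05,
    "feature_5": 0.04,
    "feature_6": 0.03,
}

selected = select_features(example_importances)
print(selected)
```

With these illustrative scores the knee falls at the third-ranked feature, so the first three features are kept. Whether the knee point itself is retained is a design choice made here for illustration only.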
| Original language | English |
|---|---|
| Title of host publication | Proceedings - 2024 International Conference on Machine Learning and Applications, ICMLA 2024 |
| Editors | M. Arif Wani, Plamen Angelov, Feng Luo, Mitsunori Ogihara, Xintao Wu, Radu-Emil Precup, Ramin Ramezani, Xiaowei Gu |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 1844-1851 |
| Number of pages | 8 |
| ISBN (Electronic) | 9798350374889 |
| DOIs | |
| State | Published - 2024 |
| Event | 23rd IEEE International Conference on Machine Learning and Applications, ICMLA 2024 - Miami, United States. Duration: Dec 18 2024 → Dec 20 2024 |
Publication series
| Name | Proceedings - 2024 International Conference on Machine Learning and Applications, ICMLA 2024 |
|---|---|
Conference
| Conference | 23rd IEEE International Conference on Machine Learning and Applications, ICMLA 2024 |
|---|---|
| Country/Territory | United States |
| City | Miami |
| Period | 12/18/24 → 12/20/24 |
Funding
This research is part of the LandScan Program, which is funded by the United States Department of Defense (DoD).
Keywords
- feature selection
- gridded population modeling
- kneedle algorithm
- machine learning
- population features
- random forest