Abstract
Rising sea levels due to climate change increasingly threaten medical infrastructure through flooding. This study develops machine learning models to predict flood exposure for 11,508 medical facilities in the southeastern coastal regions of the United States. The models integrate meteorological, hydrological, topographic, and geological data, the Natural Risk Index, and historical flood records from NASA, HIFLD, and FEMA. Six regression models, namely Linear Regression, Support Vector Regression, Random Forest, k-Nearest Neighbors, XGBoost, and Artificial Neural Networks, are trained using 16 explanatory variables identified through literature review and correlation analysis. Data preprocessing employs SMOGN to address class imbalance and Winsorization to limit the influence of outliers. Model performance is evaluated using mean absolute error (MAE), mean squared error (MSE), and root mean squared error (RMSE), with the Random Forest and XGBoost models achieving the lowest errors (MSE of 2.58e-5 and 3.69e-5, respectively). This multifactorial approach allows the models to capture complex flood-influencing relationships, enhancing adaptability and performance across geographic regions. Future work will extend the analysis across the U.S. and develop a near real-time flood monitoring system.
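As a rough illustration of two steps named in the abstract, the sketch below shows one common form of Winsorization and the three error metrics. It is not the authors' code: the function names, the 5% Winsorization limit, and the sample values are assumptions made for illustration only, since the abstract does not state them.

```python
import math


def winsorize(values, limit=0.05):
    """Clamp the lowest and highest `limit` share of values to the
    nearest retained value. The 5% default is an assumed setting."""
    n = len(values)
    k = int(limit * n)  # number of values to clamp at each tail
    if k == 0:
        return list(values)
    ordered = sorted(values)
    low, high = ordered[k], ordered[n - k - 1]
    return [min(max(v, low), high) for v in values]


def regression_errors(y_true, y_pred):
    """Return MAE, MSE, and RMSE for paired observations and predictions."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(r) for r in residuals) / len(residuals)
    mse = sum(r * r for r in residuals) / len(residuals)
    return {"MAE": mae, "MSE": mse, "RMSE": math.sqrt(mse)}


# Made-up target values and predictions, for illustration only.
y_true = [0.0, 0.1, 0.2, 0.4]
y_pred = [0.0, 0.2, 0.1, 0.4]
errors = regression_errors(y_true, y_pred)

# One extreme value is pulled back to the nearest retained value.
clipped = winsorize([1, 2, 3, 4, 100], limit=0.2)
```

Because RMSE is the square root of MSE, the reported MSE values of 2.58e-5 and 3.69e-5 correspond to RMSE values of roughly 5.1e-3 and 6.1e-3.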
| Original language | English |
|---|---|
| Title of host publication | Proceedings - 2025 IEEE International Conference on Big Data and Smart Computing, BigComp 2025 |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 49-56 |
| Number of pages | 8 |
| Edition | 2025 |
| ISBN (Electronic) | 9798331529024 |
| State | Published - 2025 |
| Externally published | Yes |
| Event | 2025 IEEE International Conference on Big Data and Smart Computing, BigComp 2025 - Kota Kinabalu, Malaysia; Duration: Feb 9 2025 → Feb 12 2025 |
Conference
| Conference | 2025 IEEE International Conference on Big Data and Smart Computing, BigComp 2025 |
|---|---|
| Country/Territory | Malaysia |
| City | Kota Kinabalu |
| Period | 02/09/25 → 02/12/25 |
Funding
We want to thank Ishan Mukherjee and Megan Yu for their early contributions as teammates during the Xinformatics and Data Science courses at Rensselaer Polytechnic Institute (RPI). We also express our gratitude to the Institute for Data Exploration and Applications (IDEA) at RPI and Dr. James Hendler for his generous support for this project, as well as to the School of Science and the School of Architecture.
Keywords
- flooded fraction
- flooding disaster
- flooding prediction
- machine learning
- medical infrastructure