Multiple Data Imputation Methods Advance Risk Analysis and Treatability of Co-occurring Inorganic Chemicals in Groundwater

Akhlak U. Mahmood, Minhazul Islam, Alexey V. Gulyuk, Emily Briese, Carmen A. Velasco, Mohit Malu, Naushita Sharma, Andreas Spanias, Yaroslava G. Yingling, Paul Westerhoff

Research output: Contribution to journal › Article › peer-review

Abstract

Accurately assessing and managing risks associated with inorganic pollutants in groundwater is imperative. Historic water quality databases are often sparse due to rationale or financial budgets for sample collection and analysis, posing challenges in evaluating exposure or water treatment effectiveness. We utilized and compared two advanced multiple data imputation techniques, AMELIA and MICE algorithms, to fill gaps in sparse groundwater quality data sets. AMELIA outperformed MICE in handling missing values, as MICE tended to overestimate certain values, resulting in more outliers. Field data sets revealed that 75% to 80% of samples exhibited no co-occurring regulated pollutants surpassing MCL values, whereas imputed values showed only 15% to 55% of the samples posed no health risks. Imputed data unveiled a significant increase, ranging from 2 to 5 times, in the number of sampling locations predicted to potentially exceed health-based limits and identified samples where 2 to 6 co-occurring chemicals may occur and surpass health-based levels. Linking imputed data to sampling locations can pinpoint potential hotspots of elevated chemical levels and guide optimal resource allocation for additional field sampling and chemical analysis. With this approach, further analysis of complete data sets allows state agencies authorized to conduct groundwater monitoring, often with limited financial resources, to prioritize sampling locations and chemicals to be tested. Given existing data and time constraints, it is crucial to identify the most strategic use of the available resources to address data gaps effectively. This work establishes a framework to enhance the beneficial impact of funding groundwater data collection by reducing uncertainty in prioritizing future sampling locations and chemical analyses.
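
The exceedance tallies described in the abstract (the share of samples with no chemical above a health-based limit, and the number of co-occurring exceedances per sample) can be illustrated with a short Python sketch. It is not the authors' code: the data are synthetic, the two "completed" data sets merely stand in for the output of a multiple-imputation run such as AMELIA or MICE, and the limit values, function names, and the simple averaging across imputed sets are assumptions made for illustration.

```python
from statistics import mean

# Illustrative health-based limits in mg/L (assumed for this sketch;
# consult current regulations for authoritative values).
LIMITS = {"arsenic": 0.010, "fluoride": 4.0, "nitrate_as_N": 10.0}


def count_exceedances(sample, limits=LIMITS):
    """Number of chemicals in one sample whose concentration is above its limit."""
    return sum(1 for chem, limit in limits.items() if sample[chem] > limit)


def fraction_without_exceedance(samples, limits=LIMITS):
    """Share of samples in which no chemical exceeds its limit."""
    clean = sum(1 for s in samples if count_exceedances(s, limits) == 0)
    return clean / len(samples)


def pooled_fraction(imputed_sets, limits=LIMITS):
    """Average the no-exceedance share over several completed (imputed) data sets."""
    return mean(fraction_without_exceedance(ds, limits) for ds in imputed_sets)


# Two synthetic completed data sets (hypothetical values). In a real workflow
# each would come from one draw of a multiple-imputation procedure.
imputed_sets = [
    [
        {"arsenic": 0.004, "fluoride": 1.2, "nitrate_as_N": 3.0},
        {"arsenic": 0.015, "fluoride": 4.5, "nitrate_as_N": 2.0},
        {"arsenic": 0.020, "fluoride": 0.8, "nitrate_as_N": 12.0},
        {"arsenic": 0.002, "fluoride": 0.5, "nitrate_as_N": 1.0},
    ],
    [
        {"arsenic": 0.004, "fluoride": 1.2, "nitrate_as_N": 3.0},
        {"arsenic": 0.015, "fluoride": 3.5, "nitrate_as_N": 2.0},
        {"arsenic": 0.020, "fluoride": 0.8, "nitrate_as_N": 12.0},
        {"arsenic": 0.012, "fluoride": 0.5, "nitrate_as_N": 1.0},
    ],
]

if __name__ == "__main__":
    for i, data_set in enumerate(imputed_sets, start=1):
        counts = [count_exceedances(s) for s in data_set]
        print(f"imputed set {i}: exceedances per sample = {counts}")
    print("pooled share of samples with no exceedance:", pooled_fraction(imputed_sets))
```

Averaging across the completed data sets is the simplest way to summarize them. A fuller analysis would also report the spread between imputations, since that spread reflects the uncertainty introduced by the missing values.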

Original language: English
Journal: Environmental Science and Technology
State: Accepted/In press - 2024
Externally published: Yes

Funding

This work was supported by the Science and Technologies for Phosphorus Sustainability (STEPS) Center, a National Science Foundation Science and Technology Center (CBET-2019435), and by the National Institute of Environmental Health Sciences through the Metals and metal mixtures: Cognitive aging, remediation, and exposure sources (MEMCARE) center (#P42ES030990). We acknowledge discussions with state regulatory agencies in Arizona about the importance of reducing uncertainty and prioritizing locations for groundwater sampling and analysis when financial resources for monitoring are limited. Laurel Passantino provided technical editing.

Funders (funder number, where listed):
National Institute of Environmental Health Sciences
MEMCARE
BioXFEL Science and Technology Center (CBET-2019435)

    Keywords

    • chemicals
    • contaminants
    • drinking water
    • pollutants
    • statistics
