Enhanced climate reproducibility testing with false discovery rate correction

Research output: Contribution to journal › Article › peer-review

Abstract

Simulating the Earth's climate is an important and complex problem, and climate models are correspondingly complex, comprising millions of lines of code. To take advantage of the latest computational and software infrastructure advancements in Earth system models running on modern hybrid computing architectures, whether to improve performance, precision, accuracy, or all three, it is important to ensure that model simulations are repeatable and robust. Because bit-for-bit reproducibility may not always be achievable, this introduces the need for establishing statistical (non-bit-for-bit) reproducibility. Here, we propose a short-simulation ensemble-based test for an atmosphere model to evaluate the null hypothesis that modified model results are statistically equivalent to those of the original model. We implement this test in version 2 of the US Department of Energy's Energy Exascale Earth System Model (E3SM). The test evaluates a standard set of output variables across the two simulation ensembles and uses a false discovery rate correction to account for multiple testing. The false positive rates of the test are examined using re-sampling techniques on large simulation ensembles and are found to be lower than those of the bootstrapping-based testing approach currently implemented in E3SM. We also evaluate the statistical power of the test using perturbed simulation ensemble suites, each with a progressively larger magnitude of change to a tuning parameter. The new test is generally found to exhibit more statistical power than the current approach, being able to detect smaller changes in parameter values with higher confidence.
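
To illustrate the multiple-testing step mentioned in the abstract, the sketch below applies a false discovery rate correction to a list of per-variable p-values. The abstract does not say which FDR procedure or which per-variable two-sample test is used, so the Benjamini-Hochberg step-up rule here is an assumption, and the function name `benjamini_hochberg` and the example p-values are hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float], alpha: float = 0.05) -> List[bool]:
    """Flag which null hypotheses are rejected under the Benjamini-Hochberg rule.

    Assumption: one p-value per output variable, each from a two-sample test
    comparing the baseline ensemble with the modified ensemble.
    """
    m = len(p_values)
    if m == 0:
        return []

    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Step-up rule: find the largest 1-based rank k with p_(k) <= (k / m) * alpha.
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            largest_k = rank

    # Reject the hypotheses belonging to the k smallest p-values.
    reject = [False] * m
    for idx in order[:largest_k]:
        reject[idx] = True
    return reject


# Hypothetical p-values for ten output variables.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
flags = benjamini_hochberg(example_p, alpha=0.05)
print(sum(flags), "of", len(flags), "variables flagged as different")
```

In this example, five of the ten p-values fall below 0.05 on their own, but only the two smallest survive the correction. That reduction in spurious per-variable detections is the purpose of an FDR correction when many output variables are tested at once.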

Original language: English
Pages (from-to): 23-39
Number of pages: 17
Journal: Earth System Dynamics
Volume: 17
Issue number: 1
State: Published - Jan 6 2026

Funding

This research was supported as part of the Energy Exascale Earth System Model (E3SM) project, funded by the U.S. Department of Energy, Office of Science, Office of Biological and Environmental Research. The authors also gratefully acknowledge the computing resources provided on Blues, a high-performance computing cluster operated by the Laboratory Computing Resource Center at Argonne National Laboratory. This research used resources of the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725. The authors also acknowledge the numerous open-source libraries on which this work depends: Harris et al. (2020), Virtanen et al. (2020), Hoyer and Hamman (2017), Dask Development Team (2016), McKinney (2010), Hunter (2007), Waskom (2021), and Seabold and Perktold (2010). The authors also acknowledge the seven anonymous reviewers, whose comments have made this a more robust investigation. This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan, last access: 12 December 2025).
