Abstract
Data assimilation (DA) in geophysical sciences remains the cornerstone of robust forecasts from numerical models. Indeed, DA plays a central role in the quality of numerical weather prediction and is a key building block that has allowed dramatic improvements in weather forecasting over the past few decades. DA is commonly framed in a variational setting, where one solves an optimization problem within a Bayesian formulation using raw model forecasts as a prior and observations as likelihood. This leads to a DA objective function that needs to be minimized, where the decision variables are the initial conditions specified to the model. In traditional DA, the forward model is numerically and computationally expensive. Here we replace the forward model with a low-dimensional, data-driven, and differentiable emulator. Consequently, gradients of our DA objective function with respect to the decision variables are obtained rapidly via automatic differentiation. We demonstrate our approach by performing an emulator-assisted DA forecast of geopotential height. Our results indicate that emulator-assisted DA is faster than traditional equation-based DA forecasts by four orders of magnitude, allowing computations to be performed on a workstation rather than a dedicated high-performance computer. In addition, we describe accuracy benefits of emulator-assisted DA when compared to simply using the emulator for forecasting (i.e., without DA). Our overall formulation is denoted AIEADA (Artificial Intelligence Emulator-Assisted Data Assimilation).
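The variational formulation described in the abstract can be sketched in a few lines of JAX. This is a minimal illustration under stated assumptions, not the authors' implementation: `emulator_step` is a hypothetical fixed linear map standing in for the trained emulator, the background and observation error covariances are reduced to scalar variances (`b_var`, `r_var`), every latent component is assumed to be observed directly, and plain gradient descent stands in for whatever optimizer the paper uses. All function and parameter names are illustrative.

```python
import jax
import jax.numpy as jnp


def emulator_step(z):
    # Hypothetical stand-in for a trained low-dimensional emulator:
    # a fixed, mildly damped rotation of a 2-component latent state.
    A = jnp.array([[0.9, 0.1], [-0.1, 0.9]])
    return A @ z


def rollout(z0, n_steps):
    # Advance the emulator n_steps times and return the stacked states.
    def body(z, _):
        z_next = emulator_step(z)
        return z_next, z_next

    _, trajectory = jax.lax.scan(body, z0, None, length=n_steps)
    return trajectory


def da_objective(z0, z_background, observations, b_var, r_var):
    # Background (prior) term: distance of the initial condition
    # from the raw forecast, weighted by an assumed scalar variance.
    background_term = jnp.sum((z0 - z_background) ** 2) / b_var
    # Observation (likelihood) term: mismatch between the emulator
    # rollout and the observations over the assimilation window.
    trajectory = rollout(z0, observations.shape[0])
    observation_term = jnp.sum((trajectory - observations) ** 2) / r_var
    return 0.5 * (background_term + observation_term)


# Gradient with respect to the initial condition (first argument),
# obtained by automatic differentiation through the rollout.
grad_fn = jax.jit(jax.grad(da_objective))


def assimilate(z_background, observations, b_var=1.0, r_var=0.1,
               lr=0.02, n_iters=200):
    # Plain gradient descent on the initial condition (an assumption;
    # a quasi-Newton method would be a common alternative).
    z0 = z_background
    for _ in range(n_iters):
        z0 = z0 - lr * grad_fn(z0, z_background, observations, b_var, r_var)
    return z0


# Toy experiment: noise-free observations generated from a known truth.
z_true = jnp.array([1.0, -0.5])
z_background = jnp.array([0.5, 0.0])
observations = rollout(z_true, 5)
z_analysis = assimilate(z_background, observations)
```

In this toy setting the analysis state should land closer to the truth than the background does, because the observations are noise-free and weighted more heavily than the prior. The step size `lr` was chosen for this particular linear map; a different emulator would need retuning or an optimizer with a line search.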
Original language | English |
---|---|
Pages (from-to) | 3433-3445 |
Number of pages | 13 |
Journal | Geoscientific Model Development |
Volume | 15 |
Issue number | 8 |
DOIs | |
State | Published - May 2 2022 |
Externally published | Yes |
Funding
Acknowledgements. This material is based upon work supported by Laboratory Directed Research and Development (LDRD) funding from Argonne National Laboratory provided by the director of the Office of Science of the US Department of Energy (contract no. DE-AC02-06CH11357). This material is partially based upon work supported by the Office of Advanced Scientific Computing Research of the Office of Science of the US Department of Energy (DOE; contract no. DE-AC02-06CH11357). This research was funded in part by and used resources of the Argonne Leadership Computing Facility, a DOE Office of Science User Facility (contract no. DE-AC02-06CH11357). Romit Maulik acknowledges support of the Advanced Scientific Computing Research (ASCR) project “Data-Intensive Scientific Machine Learning and Analysis” (grant no. DE-FOA-0002493). Gianmarco Mengaldo acknowledges support from an NUS (National University of Singapore) startup grant (no. 22-3565-A0001-1) and an MOE (Ministry of Education) Tier 1 grant (no. 22-4900-A0001-0).

Financial support. This research has been supported by Argonne