Abstract

Sparse observations and coarse-resolution climate models limit effective regional decision-making, underscoring the need for robust downscaling. However, existing AI methods struggle with generalization across variables and geographies and are constrained by the quadratic complexity of Vision Transformer (ViT) self-attention. We introduce ORBIT-2, a scalable foundation model for global, hyper-resolution climate downscaling. ORBIT-2 incorporates two key innovations: (1) Residual Slim ViT (Reslim), a lightweight architecture with residual learning and Bayesian regularization for efficient, robust prediction; and (2) TILES, a tile-wise sequence scaling algorithm that reduces self-attention complexity from quadratic to linear, enabling long-sequence processing and massive parallelism. ORBIT-2 scales to 10 billion parameters across 65,536 GPUs, achieving up to 4.1 ExaFLOPS sustained throughput and 74-98% strong scaling efficiency. It supports downscaling to 0.9 km global resolution and processes sequences of up to 4.2 billion tokens. On 7 km resolution benchmarks, ORBIT-2 achieves high accuracy, with R² scores in the range of 0.98-0.99 against observation data.
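
As a rough illustration of why restricting attention to tiles changes the cost scaling from quadratic to linear, the sketch below counts query-key pairs for global self-attention and for attention confined to fixed-size tiles. It is a simplified cost model under stated assumptions, not the TILES algorithm itself: the function names, the contiguous-tile split, and the tile size of 4,096 tokens are all hypothetical and do not come from the paper.

```python
def full_attention_pairs(num_tokens: int) -> int:
    """Query-key pairs scored by global self-attention: N * N."""
    return num_tokens * num_tokens


def tiled_attention_pairs(num_tokens: int, tile_size: int) -> int:
    """Query-key pairs scored when attention is confined to tiles.

    Assumption (not from the paper): tokens are split into contiguous
    tiles of at most `tile_size` tokens, and each tile attends only
    within itself.
    """
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    full_tiles, remainder = divmod(num_tokens, tile_size)
    # Each full tile costs tile_size^2; a trailing partial tile costs remainder^2.
    return full_tiles * tile_size * tile_size + remainder * remainder


if __name__ == "__main__":
    tile = 4096  # hypothetical tile size
    for n in (tile, 4 * tile, 16 * tile):
        print(n, full_attention_pairs(n), tiled_attention_pairs(n, tile))
```

With the tile size held fixed, doubling the token count doubles the tiled pair count but quadruples the global one, which is the sense in which the per-tile cost model is linear in sequence length.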

Original language: English
Title of host publication: Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2025
Publisher: Association for Computing Machinery, Inc
Pages: 86-98
Number of pages: 13
ISBN (Electronic): 9798400714665
State: Published - Nov 15 2025
Event: 2025 International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2025 - St. Louis, United States
Duration: Nov 16 2025 to Nov 21 2025

Publication series

Name: Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2025

Conference

Conference: 2025 International Conference for High Performance Computing, Networking, Storage, and Analysis, SC 2025
Country/Territory: United States
City: St. Louis
Period: 11/16/25 to 11/21/25

Funding

We thank Jonathan Coles for helping us get scalability numbers on the Alps cluster. This manuscript has been authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). This research was primarily supported by ORNL's AI Initiative, sponsored by the Director's Research and Development Program at ORNL, and additionally supported by a DOE Early Career Project sponsored by the BER program. It was also supported as part of the Energy Exascale Earth System Model (E3SM) project, funded by the U.S. Department of Energy Office of Science, Office of Biological and Environmental Research, and Earth Systems Model Development Program. An award of computer time was provided by the INCITE program through the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility.

Fingerprint

Dive into the research topics of 'ORBIT-2: Scaling Exascale Vision Foundation Models for Weather and Climate Downscaling'. Together they form a unique fingerprint.
