Scalable Multi-Facility Workflows for Artificial Intelligence Applications in Climate Research

Takuya Kurihana, Tyler J. Skluzacek, Rafael Ferreira Da Silva, Valentine Anantharaj

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-reviewed

Abstract

Earth observation satellites and earth system models are sources of vast, multi-modal datasets that are invaluable for advancing climate and environmental research. However, their scale and complexity pose significant challenges for processing and analysis. In this paper, we discuss our experiences in developing and using a scientific research application that relies on an automated multi-facility workflow. The workflow orchestrates data collection, preprocessing, artificial intelligence (AI) inferencing, and data movement across diverse computational resources, leveraging the Advanced Computing Ecosystem Testbed at the Oak Ridge Leadership Computing Facility (OLCF). We demonstrate that our workflow can be seamlessly integrated and orchestrated across research facilities managed by different federal agencies, thus allowing users to extract new scientific insights from climate datasets. The experimental results indicate that the multi-facility workflow significantly reduces processing time, enhances scalability, and maintains high efficiency across varying workloads. Notably, our workflow processes 12,000 high-resolution satellite images in just 44 seconds using 80 workers distributed across 10 nodes on the OLCF systems. Such high throughput is essential for dynamic tokenization and sharding of petascale satellite data for distributed AI model training and inferencing at scale across thousands of GPUs.
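
To put the reported figures in perspective, the short sketch below derives the implied throughput from the numbers quoted in the abstract (12,000 images, 44 seconds, 80 workers, 10 nodes). It is back-of-the-envelope arithmetic only, not the authors' benchmark code; the per-worker and per-node rates assume the work was spread evenly, which the abstract does not state.

```python
# Figures quoted in the abstract.
images = 12_000
seconds = 44
workers = 80
nodes = 10

# Aggregate throughput implied by those figures.
throughput = images / seconds

# Assumption: work is spread evenly across workers and nodes.
per_worker = throughput / workers
per_node = throughput / nodes
workers_per_node = workers // nodes

print(f"{throughput:.1f} images/s overall")
print(f"{per_worker:.2f} images/s per worker")
print(f"{per_node:.1f} images/s per node")
print(f"{workers_per_node} workers per node")
```

Under the even-distribution assumption, this works out to roughly 273 images per second overall, or about 3.4 images per second per worker.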

Original language: English
Title of host publication: Proceedings of SC 2024-W
Subtitle of host publication: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 2127-2134
Number of pages: 8
ISBN (Electronic): 9798350355543
State: Published - 2024
Event: 2024 Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC Workshops 2024 - Atlanta, United States
Duration: Nov 17 2024 to Nov 22 2024

Publication series

Name: Proceedings of SC 2024-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis

Conference

Conference: 2024 Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC Workshops 2024
Country/Territory: United States
City: Atlanta
Period: 11/17/24 to 11/22/24

Funding

This research used resources of the Oak Ridge Leadership Computing Facility at ORNL, which is a DOE Office of Science User Facility under Contract No. DE-AC05-00OR22725, and of the National Climate-Computing Research Center, which is located within the National Center for Computational Sciences at ORNL and supported under a DOE/NOAA Strategic Partnership Project, 2316-T849-08.

Keywords

  • artificial intelligence
  • climate data
  • multi-facility scientific workflows
  • satellite images
