Abstract
Earth observation satellites and earth system models are sources of vast, multi-modal datasets that are invaluable for advancing climate and environmental research. However, their scale and complexity pose significant challenges for processing and analysis. In this paper, we discuss our experience developing and using a scientific research application based on an automated multi-facility workflow that orchestrates data collection, preprocessing, artificial intelligence (AI) inferencing, and data movement across diverse computational resources, leveraging the Advanced Computing Ecosystem Testbed at the Oak Ridge Leadership Computing Facility (OLCF). We demonstrate that our workflow can be seamlessly integrated and orchestrated across research facilities managed by different federal agencies, thus allowing users to extract new scientific insights from climate datasets. The experimental results indicate that the multi-facility workflow significantly reduces processing time, enhances scalability, and maintains high efficiency across varying workloads. Notably, our workflow processes 12,000 high-resolution satellite images in just 44 seconds using 80 workers distributed across 10 nodes on the OLCF systems. Such high throughput is essential for dynamic tokenization and sharding of petascale satellite data for distributed AI model training and inferencing at scale across thousands of GPUs.
Original language | English |
---|---|
Title of host publication | Proceedings of SC 2024-W |
Subtitle of host publication | Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 2127-2134 |
Number of pages | 8 |
ISBN (Electronic) | 9798350355543 |
State | Published - 2024 |
Event | 2024 Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC Workshops 2024 - Atlanta, United States. Duration: Nov 17 2024 → Nov 22 2024 |
Publication series
Name | Proceedings of SC 2024-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis |
---|---|
Conference
Conference | 2024 Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC Workshops 2024 |
---|---|
Country/Territory | United States |
City | Atlanta |
Period | 11/17/24 → 11/22/24 |
Funding
This research used resources of the Oak Ridge Leadership Computing Facility at ORNL, which is a DOE Office of Science User Facility under Contract No. DE-AC05-00OR22725, and of the National Climate-Computing Research Center, which is located within the National Center for Computational Sciences at ORNL and supported under a DOE/NOAA Strategic Partnership Project, 2316-T849-08.
Keywords
- artificial intelligence
- climate data
- multi-facility scientific workflows
- satellite images