Abstract
The earth sciences research community has an unprecedented opportunity to exploit the vast amount of data available from earth observation (EO) satellites and earth system models (ESM). The ascent and application of artificial intelligence foundation models (FM) can be attributed to the availability of large volumes of curated data, access to extensive computing resources, and the maturity of deep learning techniques. Vision transformer (ViT) architectures have been adapted for image and image-like data, such as EO data and ESM simulation output. Pretraining foundation models is a compute-intensive process, often requiring 10⁵–10⁷ GPU hours for large-scale scientific applications. Because there is a limited body of knowledge on compute-optimal methods for pretraining, practitioners are often left with a trial-and-error process. We have performed a series of experiments using ViT backbones at different scales to understand optimal and cost-effective ways to improve scientific throughput. This preliminary benchmark provides an assessment of which architectures and model configurations are favorable in a given scientific context.
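As a rough illustration of how pretraining budgets of this magnitude arise, the sketch below sizes standard ViT backbones at several scales and converts an assumed training budget into GPU hours using the common 6·N·D FLOP rule of thumb. The ViT-B/L/H configurations follow the standard definitions, but the token budget and sustained per-GPU throughput are hypothetical assumptions for illustration only, not figures taken from the paper.

```python
# Back-of-the-envelope sizing for ViT backbones at different scales.
# The configs below are the standard ViT-B/L/H definitions; the 6*N*D FLOP
# rule, the token budget, and the sustained throughput are rough assumptions.

VIT_CONFIGS = {
    # name: (transformer depth, hidden dimension)
    "ViT-B": (12, 768),
    "ViT-L": (24, 1024),
    "ViT-H": (32, 1280),
}

def approx_params(depth: int, dim: int) -> float:
    """Approximate parameter count: ~12*d^2 per block (attention + 4x MLP),
    ignoring patch/position embeddings, biases, and layer norms."""
    return depth * 12 * dim ** 2

def pretrain_gpu_hours(n_params: float,
                       tokens_seen: float = 1e12,       # assumed token budget
                       sustained_flops: float = 5e13):  # assumed ~50 TFLOP/s sustained per GPU
    """Estimate GPU hours with the common C ~ 6*N*D training-FLOP rule of thumb."""
    total_flops = 6.0 * n_params * tokens_seen
    return total_flops / sustained_flops / 3600.0

if __name__ == "__main__":
    for name, (depth, dim) in VIT_CONFIGS.items():
        n = approx_params(depth, dim)
        hours = pretrain_gpu_hours(n)
        print(f"{name}: ~{n / 1e6:.0f}M params, "
              f"~{hours:,.0f} GPU hours at the assumed budget and throughput")
```

Larger scientific models and bigger token budgets scale these estimates linearly, which is how the upper end of the quoted GPU-hour range is reached.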
| Original language | English |
|---|---|
| Title of host publication | IGARSS 2024 - 2024 IEEE International Geoscience and Remote Sensing Symposium, Proceedings |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 3085-3088 |
| Number of pages | 4 |
| ISBN (Electronic) | 9798350360325 |
| State | Published - 2024 |
| Event | 2024 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2024 - Athens, Greece (Jul 7 2024 → Jul 12 2024) |
Publication series
| Name | International Geoscience and Remote Sensing Symposium (IGARSS) |
|---|---|
Conference
| Conference | 2024 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2024 |
|---|---|
| Country/Territory | Greece |
| City | Athens |
| Period | 07/7/24 → 07/12/24 |
Funding
This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The publisher, by accepting the article for publication, acknowledges that the U.S. Government retains a non-exclusive, paid up, irrevocable, world-wide license to publish or reproduce the published form of the manuscript, or allow others to do so, for U.S. Government purposes. The DOE will provide public access to these results in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan). This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is also supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. Takuya Kurihana performed the work while at the University of Chicago in the USA.
Keywords
- High performance computing
- artificial intelligence
- benchmarking
- foundation models