An Agenda for Multimodal Foundation Models for Earth Observation

Philipe Dias, Abhishek Potnis, Sreelekha Guggilam, Lexie Yang, Aristeidis Tsaris, Henry Medeiros, Dalton Lunga

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

2 Scopus citations

Abstract

Archives of remote sensing (RS) data are increasing swiftly as new sensing modalities with enhanced spatiotemporal resolution become operational. While promising new breakthroughs, the sheer volume of RS archives stretches the limits of human analysts and existing AI tools, as most models are: i) limited to single data modalities; ii) task-specific; iii) heavily reliant on labeled data. The emerging Foundation Models (FMs) have the potential to address these limitations. Trained on vast unlabeled datasets through self-supervised learning, FMs enable generic feature extraction that facilitates specialization to a wide variety of downstream tasks. This paper describes a vision towards an FM for multimodal Earth Observation data (FM4EO), discussing key building blocks and open challenges. We put particular emphasis on multimodal reasoning, a topic underexplored in EO. Our ultimate goal is a practical path toward FM4EO with capacity to unlock breakthroughs in few-shot learning scenarios, multimodal geographic knowledge integration, synthesis, and hypothesis generation.

Original language: English
Title of host publication: IGARSS 2023 - 2023 IEEE International Geoscience and Remote Sensing Symposium, Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1237-1240
Number of pages: 4
ISBN (Electronic): 9798350320107
State: Published - 2023
Externally published: Yes
Event: 2023 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2023 - Pasadena, United States
Duration: Jul 16 2023 - Jul 21 2023

Publication series

Name: International Geoscience and Remote Sensing Symposium (IGARSS)
Volume: 2023-July

Conference

Conference: 2023 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2023
Country/Territory: United States
City: Pasadena
Period: 07/16/23 - 07/21/23

Funding

This manuscript has been authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).

Keywords

  • Foundation Model
  • earth observation
  • multimodal reasoning
  • remote sensing
  • self-supervision
