The Earth System Grid Federation: An open infrastructure for access to distributed geospatial data

Luca Cinquini, Daniel Crichton, Chris Mattmann, John Harney, Galen Shipman, Feiyi Wang, Rachana Ananthakrishnan, Neill Miller, Sebastian Denvil, Mark Morgan, Zed Pobre, Gavin M. Bell, Charles Doutriaux, Robert Drach, Dean Williams, Philip Kershaw, Stephen Pascoe, Estanislao Gonzalez, Sandro Fiore, Roland Schweitzer

Research output: Contribution to journal › Article › peer-review

159 Scopus citations

Abstract

The Earth System Grid Federation (ESGF) is a multi-agency, international collaboration that aims at developing the software infrastructure needed to facilitate and empower the study of climate change on a global scale. The ESGF's architecture employs a system of geographically distributed peer nodes, which are independently administered yet united by the adoption of common federation protocols and application programming interfaces (APIs). The cornerstones of its interoperability are the peer-to-peer messaging that is continuously exchanged among all nodes in the federation; a shared architecture and API for search and discovery; and a security infrastructure based on industry standards (OpenID, SSL, GSI and SAML). The ESGF software stack integrates custom components (for data publishing, searching, user interface, security and messaging), developed collaboratively by the team, with popular application engines (Tomcat, Solr) available from the open source community. The full ESGF infrastructure has now been adopted by multiple Earth science projects and allows access to petabytes of geophysical data, including the entire Fifth Coupled Model Intercomparison Project (CMIP5) output used by the Intergovernmental Panel on Climate Change (IPCC) Fifth Assessment Report (AR5) and a suite of satellite observations (obs4MIPs) and reanalysis data sets (ANA4MIPs). This paper presents ESGF as a successful example of integration of disparate open source technologies into a cohesive, widely functional system, and describes our experience in building and operating a distributed and federated infrastructure to serve the needs of the global climate science community.

Original language: English
Pages (from-to): 400-417
Number of pages: 18
Journal: Future Generation Computer Systems
Volume: 36
DOIs
State: Published - Jul 2014

Funding

The development and operation of ESGF are supported by the efforts of principal investigators, software engineers, data managers and system administrators from many agencies and institutions worldwide. Primary contributors include ANL, ANU, BADC, CMCC, DKRZ, ESRL, GFDL, GSFC, JPL, IPSL, NCAR, ORNL, LBNL, LLNL (leading institution), PMEL, PNNL and SNL. Major funding is provided by the U.S. Department of Energy, the National Aeronautics and Space Administration (NASA), and the European Infrastructure for the European Network for Earth System Modeling (IS-ENES).

Funders | Funder number
European Infrastructure for the European Network for Earth System Modeling
IS-ENES
U.S. Department of Energy
National Oceanic and Atmospheric Administration

    Keywords

    • CMIP5
    • Climate science
    • Discovery
    • Federation
    • Peer-to-peer
    • Search
