SODA: Science-driven orchestration of data analytics

Jai Dayal, Jay Lofstead, Greg Eisenhauer, Karsten Schwan, Matthew Wolf, Hasan Abbasi, Scott Klasky

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

6 Scopus citations

Abstract

As scientific simulation applications evolve on the path towards exascale, a new model of scientific inquiry is required in which online analytics operate on the data a simulation produces while that simulation is still running. By avoiding offline data storage except when absolutely necessary, this model speeds up the scientific discovery process: it provides rapid insights into the simulated science phenomena and affords more frequent, detailed data analytics than is possible with the traditional purely offline approach of using disk for intermediate data storage. However, a challenge for online analytics is to respond to behavior dynamics caused by changing simulation outputs and by unforeseen events on the underlying hardware/software platforms. This paper presents SODA, a set of run-time abstractions for online orchestration of data analytics, realized by embedding analytics tasks into workstations that monitor component behavior and enable responses to run-time changes in their resource demands and in the platform's resource availability. For high-end simulations running on a leadership-class machine, experimental evaluations show that SODA can invoke efficient orchestration operations that respond to a diverse set of run-time dynamics at different granularities to meet end-user and analysis-specific requirements.

Original language: English
Title of host publication: Proceedings - 11th IEEE International Conference on eScience, eScience 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 475-484
Number of pages: 10
ISBN (Electronic): 9781467393256
State: Published - Oct 22 2015
Event: 11th IEEE International Conference on eScience, eScience 2015 - Munich, Germany
Duration: Aug 31 2015 to Sep 4 2015

Publication series

Name: Proceedings - 11th IEEE International Conference on eScience, eScience 2015

Conference

Conference: 11th IEEE International Conference on eScience, eScience 2015
Country/Territory: Germany
City: Munich
Period: 08/31/15 to 09/4/15

Funding

This research was supported by the Department of Energy Office of Advanced Scientific Computing Research. It also used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000. SAND2014-19011 C.

Funders (with funder number, where given):
Department of Energy Office of Advanced Scientific Computing Research
U.S. Department of Energy: DE-AC05-00OR22725
Lockheed Martin Corporation
Office of Science
National Nuclear Security Administration: DE-AC04-94AL85000, SAND2014-19011

    Keywords

    • Data analytics
    • Data staging
    • In-situ
    • Resource sharing
    • Runtime management
    • Scalable I/O
    • Visualization
