Measuring the impact of burst buffers on data-intensive scientific workflows

Rafael Ferreira da Silva, Scott Callaghan, Tu Mai Anh Do, George Papadimitriou, Ewa Deelman

Research output: Contribution to journal, Article, peer-review

9 Scopus citations

Abstract

Science applications frequently produce and consume large volumes of data, but delivering this data to and from compute resources can be challenging, as parallel file system performance is not keeping up with compute and memory performance. To mitigate this I/O bottleneck, some systems have deployed burst buffers, but their impact on performance for real-world scientific workflow applications is still not clear. In this paper, we examine the impact of burst buffers through the remote-shared, allocatable burst buffers on the Cori system at NERSC. We run two data-intensive workflows: a high-throughput genome analysis workflow, and a subset of the SCEC high-performance CyberShake workflow, a production seismic hazard analysis workflow. We find that using burst buffers offers read and write improvements of an order of magnitude, and that these improvements lead to increased job performance, and thereby increased overall workflow performance, even for long-running CPU-bound jobs.

Original language: English
Pages (from-to): 208-220
Number of pages: 13
Journal: Future Generation Computer Systems
Volume: 101
State: Published - Dec 2019
Externally published: Yes

Funding

This work was funded by DOE contract number #DESC0012636, “Panorama – Predictive Modeling and Diagnostic Monitoring of Extreme Science Workflows”, by NSF, USA contract number #1664162, “SI2-SSI: Pegasus: Automating Compute and Data Intensive Science”, and by NSF contract number #1741040, “BIGDATA: IA: Collaborative Research: In Situ Data Analytics for Next Generation Molecular Dynamics Workflows”. This research used resources of the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy, United States under Contract No. DE-AC02-05CH11231. CyberShake workflow research was supported by the National Science Foundation (NSF), USA under the OAC SI2-SSI grant #1148493, the OAC SI2-SSI grant #1450451, and EAR grant #1226343. This research was supported by the Southern California Earthquake Center, USA (Contribution No. 7610). SCEC is funded by NSF Cooperative Agreement EAR-1033462 & USGS Cooperative Agreement G12AC20038.

Funder: Funder number
DOE Office of Science
National Science Foundation: EAR-1033462, 1664162, 1841758, 1741040, 1450451, G12AC20038, 1148493
U.S. Department of Energy: DE-AC02-05CH11231, #DESC0012636, 0012636
Division of Earth Sciences: 1226343
U.S. Geological Survey
Office of Science
Southern California Earthquake Center: 7610

    Keywords

    • Burst buffers
    • High-performance computing
    • In transit processing
    • Scientific workflows
