DataStager: Scalable data staging services for petascale applications

Hasan Abbasi, Matthew Wolf, Greg Eisenhauer, Scott Klasky, Karsten Schwan, Fang Zheng

    Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

    120 Scopus citations

    Abstract

    Known challenges for petascale machines are that (1) the costs of I/O for high performance applications can be substantial, especially for output tasks like checkpointing, and (2) noise from I/O actions can inject undesirable delays into the runtimes of such codes on individual compute nodes. This paper introduces the flexible 'DataStager' framework for data staging, along with alternative services within it that jointly address (1) and (2). Data staging services, which move output data from compute nodes to staging or I/O nodes prior to storage, are used to reduce the I/O overhead on applications' total processing times, and explicit management of data staging reduces perturbation when extracting output data from a petascale machine's compute partition. Experimental evaluations of DataStager on the Cray XT machine at Oak Ridge National Laboratory, using the GTC fusion modeling code and benchmarks running on 1000+ processors, establish both the necessity of intelligent data staging and the high performance of our approach.

    Original language: English
    Title of host publication: Proc. 18th ACM International Symposium on High Performance Distributed Computing, HPDC 09, Co-located with the 2009 International Symposium on High Performance Distributed Computing Conf., HPDC'09
    Pages: 31-37
    Number of pages: 7
    State: Published - 2009
    Event: 18th ACM International Symposium on High Performance Distributed Computing, HPDC 09, Co-located with the 2009 International Symposium on High Performance Distributed Computing Conference, HPDC'09 - Garching, Germany
    Duration: Jun 11 2009 to Jun 13 2009

    Publication series

    Name: Proc. 18th ACM International Symposium on High Performance Distributed Computing, HPDC 09, Co-located with the 2009 International Symposium on High Performance Distributed Computing Conf., HPDC'09

    Conference

    Conference: 18th ACM International Symposium on High Performance Distributed Computing, HPDC 09, Co-located with the 2009 International Symposium on High Performance Distributed Computing Conference, HPDC'09
    Country/Territory: Germany
    City: Garching
    Period: 06/11/09 to 06/13/09

    Keywords

    • Datatap
    • GTC
    • I/O
    • Staging
    • WARP
    • XT3
    • XT4
