Durango: Scalable synthetic workload generation for extreme-scale application performance modeling and simulation

Christopher D. Carothers, Jeffrey S. Vetter, Jeremy S. Meredith, Misbah Mubarak, Shirley Moore, Mark P. Blanco, Justin Lapre

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

12 Scopus citations

Abstract

Performance modeling of extreme-scale applications on accurate representations of potential architectures is critical for designing next-generation supercomputing systems, because it is impractical to construct prototype systems at scale with new network hardware in order to explore designs and policies. However, these simulations often rely on static application traces, which can be difficult to work with because of their size and because they cannot be extended or scaled up without rerunning the original application. To address this problem, we have created a new technique for generating scalable, flexible workloads from real applications, and we have implemented a prototype, called Durango, that combines a proven analytical performance modeling language, Aspen, with the massively parallel HPC network modeling capabilities of the CODES framework. Our models are compact, parameterized, and representative of real applications with computation events. They are not resource intensive to create and are portable across simulator environments. We demonstrate the utility of Durango by simulating the LULESH application in the CODES simulation environment on several topologies and show that Durango is practical to use for simulation without loss of fidelity, as quantified by simulation metrics. During our validation of Durango's generated communication model of LULESH, we found that the original LULESH miniapp code had a latent bug in which the MPI_Waitall operation was used incorrectly. This finding underscores the potential need for a tool such as Durango, beyond its benefits for flexible workload generation and modeling. Additionally, we demonstrate the efficacy of Durango's direct integration approach, which links Aspen into CODES as part of the running network simulation model. Here, Aspen generates the application-level computation timing events, which in turn drive the start of a network communication phase. Results show that Durango's performance scales well when executing both torus and dragonfly network models on up to 4K Blue Gene/Q nodes using 32K MPI ranks. Durango also avoids the overheads and complexities associated with extreme-scale trace files.

Original language: English
Title of host publication: SIGSIM-PADS 2017 - Proceedings of the 2017 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation
Publisher: Association for Computing Machinery, Inc
Pages: 97-108
Number of pages: 12
ISBN (Electronic): 9781450344890
State: Published - May 16 2017
Event: 5th ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, SIGSIM-PADS 2017 - Singapore, Singapore
Duration: May 24 2017 to May 26 2017

Publication series

Name: SIGSIM-PADS 2017 - Proceedings of the 2017 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation

Conference

Conference: 5th ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, SIGSIM-PADS 2017
Country/Territory: Singapore
City: Singapore
Period: 05/24/17 to 05/26/17

Keywords

  • HPC networks models
  • Massively parallel simulation
  • Structural analytic models
