Scaling to a million cores and beyond: Using light-weight simulation to understand the challenges ahead on the road to exascale

Research output: Contribution to journal › Article › peer-review

36 Scopus citations

Abstract

As supercomputers scale to 1000 PFlop/s over the next decade, investigating the performance of parallel applications at scale on future architectures and the performance impact of different architecture choices for high-performance computing (HPC) hardware/software co-design is crucial. This paper summarizes recent efforts in designing and implementing a novel HPC hardware/software co-design toolkit. The presented Extreme-scale Simulator (xSim) permits running an HPC application in a controlled environment with millions of concurrent execution threads while observing its performance in a simulated extreme-scale HPC system using architectural models and virtual timing. This paper demonstrates the capabilities and usefulness of the xSim performance investigation toolkit, such as its scalability to 2^27 simulated Message Passing Interface (MPI) ranks on 960 real processor cores, the capability to evaluate the performance of different MPI collective communication algorithms, and the ability to evaluate the performance of a basic Monte Carlo application with different architectural parameters.
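
To make concrete why the choice of collective communication algorithm matters at this scale, the following is a minimal sketch using a simple latency/bandwidth cost model. It is not xSim's architectural model or API; the function names, the latency and bandwidth values, and the message size are all hypothetical and chosen only for illustration. It compares a linear broadcast (the root sends to every other rank in turn) with a binomial-tree broadcast (the number of ranks holding the message doubles each round).

```python
def message_time(latency_s, bandwidth_bytes_per_s, msg_bytes):
    """Time for one point-to-point message under a latency/bandwidth model."""
    return latency_s + msg_bytes / bandwidth_bytes_per_s


def binomial_rounds(ranks):
    """Number of doubling rounds needed to reach all ranks: ceil(log2(ranks))."""
    return (ranks - 1).bit_length()


def linear_bcast_time(ranks, latency_s, bandwidth_bytes_per_s, msg_bytes):
    """Root sends the message to each of the other ranks, one after another."""
    return (ranks - 1) * message_time(latency_s, bandwidth_bytes_per_s, msg_bytes)


def binomial_bcast_time(ranks, latency_s, bandwidth_bytes_per_s, msg_bytes):
    """Each round, every rank that already holds the message forwards it once."""
    return binomial_rounds(ranks) * message_time(
        latency_s, bandwidth_bytes_per_s, msg_bytes
    )


if __name__ == "__main__":
    ranks = 2 ** 27        # rank count quoted in the abstract
    latency = 1e-6         # hypothetical 1 microsecond link latency
    bandwidth = 1e9        # hypothetical 1 GB/s link bandwidth
    size = 1024            # hypothetical 1 KiB message

    print("linear   broadcast:", linear_bcast_time(ranks, latency, bandwidth, size), "s")
    print("binomial broadcast:", binomial_bcast_time(ranks, latency, bandwidth, size), "s")
```

Under this model the linear variant needs 2^27 - 1 sequential sends while the binomial tree needs only 27 rounds, which is the kind of difference a simulator with virtual timing can expose without access to a machine of that size.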

Original language: English
Pages (from-to): 59-65
Number of pages: 7
Journal: Future Generation Computer Systems
Volume: 30
Issue number: 1
State: Published - 2014

Funding

This research is sponsored by the Office of Advanced Scientific Computing Research, US Department of Energy (DOE). This manuscript has been authored by UT-Battelle, LLC, under Contract No. DE-AC05-00OR22725 with the DOE. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes.

Funders
US Department of Energy
Advanced Scientific Computing Research

    Keywords

    • Collective communication
    • Exascale
    • High performance computing
    • Message passing interface
    • Parallel discrete event simulation
