On scalability for MPI runtime systems

George Bosilca, Thomas Herault, Ala Rezmerita, Jack Dongarra

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

5 Scopus citations

Abstract

The future of high performance computing, as currently foretold, will gravitate toward machines of hundreds of thousands to a million nodes, harnessing the computing power of billions of cores. While the hardware side is well covered, the software infrastructure at that scale remains vague. However, no matter what that infrastructure will be, efficiently running parallel applications on such large machines will require optimized runtime environments that are scalable and resilient. More particularly, considering a future where the Message Passing Interface (MPI) remains a major programming paradigm, MPI implementations will have to adapt seamlessly to launching and managing large-scale applications on resources several orders of magnitude larger than today's. In this paper, we present a modified version of the Open MPI runtime that has been adapted toward a scalability goal. We evaluate its performance and compare it with two widely used runtime systems, the default version of Open MPI and MPICH2, using various underlying launching systems. The performance evaluation demonstrates a significant improvement over the state of the art. We also discuss the basic requirements for an exascale-ready parallel runtime.

Original language: English
Title of host publication: Proceedings - 2011 IEEE International Conference on Cluster Computing, CLUSTER 2011
Pages: 187-195
Number of pages: 9
State: Published - 2011
Externally published: Yes
Event: 2011 IEEE International Conference on Cluster Computing, CLUSTER 2011 - Austin, TX, United States
Duration: Sep 26 2011 → Sep 30 2011

Publication series

Name: Proceedings - IEEE International Conference on Cluster Computing, ICCC
ISSN (Print): 1552-5244

Conference

Conference: 2011 IEEE International Conference on Cluster Computing, CLUSTER 2011
Country/Territory: United States
City: Austin, TX
Period: 09/26/11 → 09/30/11
