TY - JOUR
T1 - Taking the MPI standard and the Open MPI library to exascale
AU - Bernholdt, David E.
AU - Bosilca, George
AU - Bouteiller, Aurelien
AU - Brightwell, Ron
AU - Ciesko, Jan
AU - Dosanjh, Matthew G.F.
AU - Georgakoudis, Giorgis
AU - Laguna, Ignacio
AU - Levy, Scott
AU - Naughton, Thomas
AU - Olivier, Stephen L.
AU - Pritchard, Howard P.
AU - Schonbein, Whit
AU - Schuchart, Joseph
AU - Shehata, Amir
N1 - Publisher Copyright: © The Author(s) 2024.
PY - 2024
Y1 - 2024
AB - The Open MPI for Exascale (OMPI-X) project was one of two in the Exascale Computing Project (ECP) focused on advancing the MPI ecosystem. The OMPI-X team worked with other MPI Forum members to champion several important features for inclusion in the MPI 4.0, 4.1, and upcoming 5.0 MPI standard versions, in support of the needs of exascale applications and systems. The team also worked with the larger Open MPI community to bring implementations of these new features and other enhancements into Open MPI, one of the leading open-source implementations of the MPI interface. This paper describes the motivation for the work of the OMPI-X project in the context of exascale computing needs, the nature of the resulting new capabilities in the MPI standard, and how they were implemented in the Open MPI library. Features include improved support for “MPI + X” programming models through partitioned communications and support for user-level threading, sessions, fault tolerance through the user-level fault mitigation (ULFM) and Reinit models, and other features. We also discuss enhancements to Open MPI providing improved performance and scalability for existing features, such as collective operations, one-sided operations, support for the Slingshot-11 interconnect of the initial exascale systems, and how the OMPI-X team worked to improve quality assurance for the Open MPI library, particularly on platforms of interest to the Department of Energy community.
KW - Exascale Computing Project
KW - Message Passing Interface
UR - http://www.scopus.com/inward/record.url?scp=85199883173&partnerID=8YFLogxK
U2 - 10.1177/10943420241265936
DO - 10.1177/10943420241265936
M3 - Article
AN - SCOPUS:85199883173
SN - 1094-3420
JO - International Journal of High Performance Computing Applications
JF - International Journal of High Performance Computing Applications
ER -