Proximity Portability and in Transit, M-to-N Data Partitioning and Movement in SENSEI

E. Wes Bethel, Burlen Loring, Utkarsh Ayachit, Earl P.N. Duque, Nicola Ferrier, Joseph Insley, Junmin Gu, James Kress, Patrick O’Leary, Dave Pugmire, Silvio Rizzi, David Thompson, Will Usher, Gunther H. Weber, Brad Whitlock, Matthew Wolf, Kesheng Wu

Research output: Chapter in Book/Report/Conference proceeding > Chapter > peer-review

1 Scopus citation

Abstract

In high-performance parallel in situ processing, the term in transit processing refers to configurations where data must move from a producer to a consumer that runs on separate resources. In the context of parallel and distributed computing on an HPC platform, one of the central challenges is to determine a mapping of data from producer ranks to consumer ranks. This problem is complicated by the heterogeneity that arises in producer-consumer pairs, such as when producer and consumer codes have different levels of concurrency, different scaling characteristics, or different data models. The resulting mapping and movement of data from M producer ranks to N consumer ranks can have a significant impact on aggregate application performance, particularly when the data consumer requires only a subset of the overall data for its task. This chapter focuses on the design considerations that underlie SENSEI’s approach to this challenging problem. These design considerations extend the core SENSEI architecture and include the need to accommodate a choice of different partitioning methods, the ability for a data consumer to request and receive only the subset of data needed for its particular operation, and the ability to leverage any of several different data transport tools. The idea of proximity portability, being able to use different data transport methods as part of an in transit workflow, is illustrated through the use of three different transport layers, where switching from one transport tool to another is accomplished with only a configuration file change. The chapter also includes a performance analysis summary showing the gains that are possible when using optimized partitioners in an in transit setting, measured by multiple metrics such as memory footprint, time to solution, and amount of data moved. These gains are made possible by an implementation shaped by the design considerations described above.

Original language: English
Title of host publication: Mathematics and Visualization
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 439-460
Number of pages: 22
State: Published - 2022

Publication series

Name: Mathematics and Visualization
ISSN (Print): 1612-3786
ISSN (Electronic): 2197-666X

Funding

Acknowledgements: This work was supported by the Director, Office of Science, Office of Advanced Scientific Computing Research, of the U.S. Department of Energy under Contract Nos. DE-AC02-05CH11231 and DE-AC02-06CH11357, through the grant “Scalable Analysis Methods and In Situ Infrastructure for Extreme Scale Knowledge Discovery,” program manager Dr. Laura Biven. This research used resources of the National Energy Research Scientific Computing Center, a DOE Office of Science User Facility supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231. Argonne National Laboratory’s work was supported by and used the resources of the Argonne Leadership Computing Facility, which is a U.S. Department of Energy, Office of Science User Facility supported under Contract No. DE-AC02-06CH11357.
