Abstract
The ubiquitous in-node heterogeneity of HPC and cloud computing platforms makes software portability and performance optimization extremely challenging. Described here, the MatRIS multilevel math library abstraction framework employs tasking to alleviate these difficulties. MatRIS includes the IRIS task-based runtime on the bottom level and exposes different layers of abstraction to render algorithms architecturally agnostic. MatRIS ensures the decomposition and creation of tasks that represent the necessary encapsulation of the optimized kernels from both vendor and open-source math libraries. Once built, MatRIS can select different combinations of accelerators at runtime, making it portable even on diverse heterogeneous architectures. By leveraging the IRIS runtime’s features for managing heterogeneity, MatRIS deploys algorithms that remove the need to specify orchestration and data transfer. This study describes how the serial task abstraction of a tiled Cholesky factorization is made portable and scalable in the case of multi-device and multi-vendor heterogeneity on a node with NVIDIA and AMD GPUs by using MatRIS. First, we demonstrate that Cholesky in MatRIS provides multi-GPU scalability that offers competitive performance versus cuSolverMG. Then, we present the challenges and opportunities for heterogeneous execution.
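The "serial task abstraction" of a tiled Cholesky factorization mentioned above can be pictured with a minimal sketch. It assumes the standard right-looking tiled variant built from POTRF, TRSM, SYRK, and GEMM tile kernels; the abstract does not say which variant MatRIS uses. The names `Task`, `tiled_cholesky_tasks`, and `infer_dependencies` are hypothetical and are not part of the MatRIS or IRIS API. The code only lists tasks in serial order and derives their dependencies from the tiles they touch, which is roughly the kind of orchestration a task runtime can take over from the programmer.

```python
from collections import namedtuple

# Hypothetical task record (not a MatRIS/IRIS type): kernel name,
# tiles read as inputs, and the single tile updated in place.
Task = namedtuple("Task", ["kernel", "reads", "writes"])


def tiled_cholesky_tasks(nt):
    """List the tile kernels of a right-looking tiled Cholesky
    (lower triangular) on an nt x nt tile grid, in serial order."""
    tasks = []
    for k in range(nt):
        # Factor the diagonal tile (k, k) in place.
        tasks.append(Task("POTRF", (), (k, k)))
        for i in range(k + 1, nt):
            # Triangular solve for each tile below the diagonal in column k.
            tasks.append(Task("TRSM", ((k, k),), (i, k)))
        for i in range(k + 1, nt):
            # Symmetric rank-k update of a trailing diagonal tile.
            tasks.append(Task("SYRK", ((i, k),), (i, i)))
            for j in range(k + 1, i):
                # General update of a trailing off-diagonal tile.
                tasks.append(Task("GEMM", ((i, k), (j, k)), (i, j)))
    return tasks


def infer_dependencies(tasks):
    """For each task, return the indices of earlier tasks it must wait for.

    Only the last writer of each tile is tracked. That is enough for this
    algorithm because every tile read as an input is already final; a
    general runtime would also have to handle write-after-read hazards.
    """
    last_writer = {}
    deps = []
    for idx, task in enumerate(tasks):
        # The output tile is updated in place, so it is also an input.
        touched = task.reads + (task.writes,)
        deps.append(sorted({last_writer[t] for t in touched if t in last_writer}))
        last_writer[task.writes] = idx
    return deps


if __name__ == "__main__":
    tasks = tiled_cholesky_tasks(3)
    for idx, (task, dep) in enumerate(zip(tasks, infer_dependencies(tasks))):
        print(idx, task.kernel, task.writes, "waits for", dep)
```

Tasks whose dependency lists are already satisfied can run concurrently, for example the TRSM tasks of one column, and in a heterogeneous setting each ready task could in principle be placed on a different device.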
| Original language | English |
|---|---|
| Title of host publication | Asynchronous Many-Task Systems and Applications - 2nd International Workshop, WAMTA 2024, Proceedings |
| Editors | Patrick Diehl, Joseph Schuchart, Pedro Valero-Lara, George Bosilca |
| Publisher | Springer Science and Business Media Deutschland GmbH |
| Pages | 59-70 |
| Number of pages | 12 |
| ISBN (Print) | 9783031617621 |
| DOIs | |
| State | Published - 2024 |
| Event | 2nd International Workshop on Asynchronous Many-Task Systems and Applications, WAMTA 2024 - Knoxville, United States. Duration: Feb 14 2024 → Feb 16 2024 |
Publication series
| Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
|---|---|
| Volume | 14626 LNCS |
| ISSN (Print) | 0302-9743 |
| ISSN (Electronic) | 1611-3349 |
Conference
| Conference | 2nd International Workshop on Asynchronous Many-Task Systems and Applications, WAMTA 2024 |
|---|---|
| Country/Territory | United States |
| City | Knoxville |
| Period | 02/14/24 → 02/16/24 |
Funding
This work is funded, in part, by Bluestone, an X-Stack project in the DOE Advanced Scientific Computing Office with program manager Hal Finkel. This manuscript has been authored by UT-Battelle LLC under contract no. DE-AC05-00OR22725 with the US Department of Energy. The publisher, by accepting the article for publication, acknowledges that the US government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of the manuscript, or allow others to do so, for US government purposes. The DOE will provide public access to these results in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).
Keywords
- Cholesky Decomposition
- Heterogeneity
- Math Library
- POTRF
- Portability
- Runtime System
- Task based programming
Fingerprint
Dive into the research topics of 'MatRIS: Addressing the Challenges for Portability and Heterogeneity Using Tasking for Matrix Decomposition (Cholesky)'. Together they form a unique fingerprint.