Abstract
High-Performance Computing is becoming increasingly heterogeneous, relying on a diverse mix of hardware to achieve good performance. Paradoxically, current drivers and frameworks for these devices typically require separate languages and implementations for each vendor. Furthermore, there are few tools and little support for scheduling codes across these devices in a truly heterogeneous manner, partly because of this fragmentation between vendors and the languages each supports. To overcome both limitations, the Intelligent Runtime System (IRIS) was developed. It allows a common task abstraction to be shared automatically among contemporary vendors and is driven from a single host-side API. At runtime, IRIS queries the host system and registers which frameworks and drivers are available; these determine which kernels the scheduler can use: CPUs via OpenMP, Nvidia GPUs via CUDA, AMD GPUs via HIP, and Intel and Xilinx FPGAs via OpenCL. IRIS enables tasks to be scheduled to any heterogeneous device and resolves to the appropriate kernel binary at runtime; it only uses the devices supported by the system on which it is run. IRIS supports both single-task and graph-based expressions of task dependencies. Additionally, IRIS features a range of dynamic scheduling policies, allowing complex chains of tasks and interactions to be executed and relieving the programmer from having to reason about the system in order to assign tasks to devices optimally. This paper presents the peak performance attainable by IRIS over a range of systems, each with a different number and type of accelerator devices. It highlights the flexibility of IRIS, since these devices are truly heterogeneous and rely on different backends (drivers, frameworks, and languages) that historically required unique implementations to utilize them. We then use this peak performance as a baseline to compare increasingly complex chains of tasks (with increasingly complex task dependencies) and evaluate how IRIS copes. Finally, we consider the performance of different IRIS scheduling policies on this range of task graphs.
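As an illustration of the single host-side API described in the abstract, the following is a minimal C sketch of how a task might be created and submitted under an IRIS-style runtime. The header name, the calls (`iris_init`, `iris_mem_create`, `iris_task_create`, `iris_task_h2d_full`, `iris_task_kernel`, `iris_task_d2h_full`, `iris_task_submit`, `iris_finalize`), the parameter-info constants, and the `iris_default` policy are assumptions made for exposition and may not match the actual IRIS headers.

```c
/* Illustrative host-side sketch of an IRIS-style task submission.
 * All API names, types, and constants below are assumptions for
 * exposition; consult the actual IRIS headers for the real interface. */
#include <stdio.h>
#include <stdlib.h>
#include <iris/iris.h>   /* assumed header name */

int main(int argc, char** argv) {
  iris_init(&argc, &argv, 1);                /* runtime registers available backends
                                                (OpenMP, CUDA, HIP, OpenCL) */
  size_t n = 1 << 20;
  double* x = (double*)malloc(n * sizeof(double));
  for (size_t i = 0; i < n; i++) x[i] = 1.0;

  iris_mem mem;                              /* device buffer managed by the runtime */
  iris_mem_create(n * sizeof(double), &mem);

  iris_task task;
  iris_task_create(&task);
  iris_task_h2d_full(task, mem, x);          /* host-to-device transfer as part of the task */

  void*  params[]      = { &mem, &n };
  int    params_info[] = { iris_rw, sizeof(n) };  /* assumed: buffer read-write, n by value */
  size_t gws[]         = { n };
  iris_task_kernel(task, "scale", 1, NULL, gws, NULL, 2, params, params_info);

  iris_task_d2h_full(task, mem, x);          /* copy the result back to the host */

  /* Let the scheduler pick a device; the kernel "scale" resolves at runtime to
     whichever binary (OpenMP/CUDA/HIP/OpenCL) exists for the chosen device. */
  iris_task_submit(task, iris_default, NULL, 1);

  iris_finalize();
  free(x);
  return 0;
}
```

A second task consuming `mem` would declare a dependency on `task` (the exact dependency call is omitted here) so that the scheduler can honor the graph ordering, and a different policy constant could be passed in place of `iris_default` to exercise the dynamic scheduling policies evaluated in the paper.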
Original language | English |
---|---|
Title of host publication | 2024 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2024 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 58-67 |
Number of pages | 10 |
ISBN (Electronic) | 9798350364606 |
State | Published - 2024 |
Event | 2024 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2024, San Francisco, United States (May 27 2024 → May 31 2024) |
Publication series
Name | 2024 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2024 |
---|---|
Conference
Conference | 2024 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2024 |
---|---|
Country/Territory | United States |
City | San Francisco |
Period | 05/27/24 → 05/31/24 |
Funding
This research used resources of the Experimental Computing Laboratory (ExCL) and the Oak Ridge Leadership Computing Facility (OLCF) at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. This research was supported by the following sources: 1) Defense Advanced Research Projects Agency (DARPA) Microsystems Technology Office (MTO) Domain-Specific System-on-Chip Program and 2) U.S. Department of Defense Advanced Computing Initiative (ACI), Brisbane: Productive Programming Systems in the Era of Extremely Heterogeneous and Ephemeral Computer Architectures. This manuscript has been co-authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan.
Keywords
- Execution Model
- Heterogeneous Computing
- Heterogeneous Systems
- High-Performance Computing
- Performance Portability
- Programming Models
- Runtime System
- Scheduling Policy
- Task Schedule