IRIS-GNN: Leveraging Graph Neural Networks for Scheduling on Truly Heterogeneous Runtime Systems

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The diversity of accelerators in computer systems poses significant challenges for software developers, such as managing vendor-specific compiler toolchains, code fragmentation requiring different kernel implementations, and performance portability issues. To address these challenges, the Intelligent Runtime System (IRIS) was developed. IRIS works across various systems, from smartphones to supercomputers, enabling automatic performance scaling based on available accelerators. It introduces abstract tasks for seamless execution transitions between accelerators while ensuring memory consistency and task dependencies. Although IRIS simplifies system details, optimal dynamic scheduling still requires user input to understand workload structures. To remove this requirement, we introduce a new scheduling policy for IRIS, termed IRIS-GNN, which is the first IRIS hybrid policy that operates in conjunction with the dynamic policies. This policy employs a Graph Neural Network (GNN) to perform graph classification on any task graph submitted to IRIS. The GNN analyzes the structure and attributes of the task graph and categorizes it as locality, concurrency, or mixed. This classification then guides the selection of the dynamic policy used by IRIS. We compare the performance of IRIS-GNN against the complete spectrum of IRIS's dynamic policies, assess the overhead introduced by the GNN within this scheduling framework, and explore its practical application in real-world scenarios.
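
The classify-then-select flow described in the abstract can be illustrated with a small sketch. This is not the authors' model or the IRIS API: the node features (in-degree and out-degree), the single mean-aggregation message-passing round, the weights in PARAMS (untrained placeholders), and the policy names in POLICY_FOR_CLASS are all assumptions made for illustration. Only the three class labels come from the abstract.

```python
import math

# The three graph classes named in the abstract.
CLASSES = ("locality", "concurrency", "mixed")

# Hypothetical policy names for illustration only (not IRIS identifiers).
POLICY_FOR_CLASS = {
    "locality": "hypothetical_locality_policy",
    "concurrency": "hypothetical_concurrency_policy",
    "mixed": "hypothetical_default_policy",
}

# Untrained placeholder weights (assumption); a real model would learn these.
PARAMS = {
    "W_self": [[1.0, 0.0], [0.0, 1.0]],
    "W_nbr": [[0.5, 0.0], [0.0, 0.5]],
    "W_out": [[1.0, -1.0], [-1.0, 1.0], [0.2, 0.2]],
    "b_out": [0.0, 0.0, 0.0],
}


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def node_features(edges, num_nodes):
    """Per-task features: [in-degree, out-degree]."""
    feats = [[0.0, 0.0] for _ in range(num_nodes)]
    for src, dst in edges:
        feats[src][1] += 1.0
        feats[dst][0] += 1.0
    return feats


def message_pass(feats, edges, W_self, W_nbr):
    """One round: h_v' = relu(W_self h_v + W_nbr mean(h_u)) over neighbours u."""
    nbrs = [[] for _ in feats]
    for src, dst in edges:  # dependencies treated as undirected for aggregation
        nbrs[src].append(dst)
        nbrs[dst].append(src)
    out = []
    for v, h_v in enumerate(feats):
        if nbrs[v]:
            mean = [sum(feats[u][i] for u in nbrs[v]) / len(nbrs[v])
                    for i in range(len(h_v))]
        else:
            mean = [0.0] * len(h_v)
        z = [a + b for a, b in zip(matvec(W_self, h_v), matvec(W_nbr, mean))]
        out.append([max(0.0, zi) for zi in z])
    return out


def classify(edges, num_nodes, params):
    """Return (label, probabilities) for a non-empty task graph."""
    h = message_pass(node_features(edges, num_nodes), edges,
                     params["W_self"], params["W_nbr"])
    # Readout: mean-pool node embeddings into one graph-level vector.
    g = [sum(h_v[i] for h_v in h) / num_nodes for i in range(len(h[0]))]
    logits = [z + b for z, b in zip(matvec(params["W_out"], g), params["b_out"])]
    m = max(logits)  # subtract the max for a numerically stable softmax
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return CLASSES[probs.index(max(probs))], probs


# A diamond-shaped task graph: task 0 feeds tasks 1 and 2, which both feed task 3.
diamond = [(0, 1), (0, 2), (1, 3), (2, 3)]
label, probs = classify(diamond, 4, PARAMS)
policy = POLICY_FOR_CLASS[label]
print(label, policy)
```

Because the weights are untrained, the label printed here carries no meaning. The sketch only shows the shape of the pipeline: task graph in, class label out, label mapped to a policy.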

Original language: English
Title of host publication: Proceedings of SC 2024-W
Subtitle of host publication: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1071-1080
Number of pages: 10
ISBN (Electronic): 9798350355543
State: Published - 2024
Event: 2024 Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC Workshops 2024 - Atlanta, United States
Duration: Nov 17 2024 → Nov 22 2024

Publication series

Name: Proceedings of SC 2024-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis

Conference

Conference: 2024 Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC Workshops 2024
Country/Territory: United States
City: Atlanta
Period: 11/17/24 → 11/22/24

Keywords

  • accelerators
  • high-performance computing
  • runtime systems
  • scheduling
