Abstract
The sequential task flow (STF) model is the mainstream approach for interacting with task-based runtime systems, with StarPU and the dynamic task discovery (DTD) interface in PaRSEC being two implementations of this model. Compared with other approaches to submitting tasks to a runtime system, STF offers an easy-to-use API that allows users to express algorithms as a sequence of tasks (much like in OpenMP), while the runtime automatically identifies and analyzes task dependencies and handles scheduling. In this paper, we focus on the DTD interface in PaRSEC, highlight some of its lesser-known limitations, and implement two optimization techniques for DTD: support for user-level graph trimming and a new API for broadcasting read-only data to remote tasks. We then analyze the benefits and limitations of these optimizations with benchmarks as well as on two common matrix factorization kernels, Cholesky and QR, on two different systems: Shaheen II from KAUST and Fugaku from RIKEN. We point out some potential for further improvements and provide valuable insights into the strengths and weaknesses of the STF model, hoping to guide the future development of task-based runtime systems.
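The STF idea summarized in the abstract, where tasks are submitted in program order and the runtime infers dependencies from how each task accesses its data, can be illustrated with a small toy model. This is only a sketch under stated assumptions: it is not the PaRSEC DTD or StarPU API, and every name in it (`ToyStfGraph`, `insert_task`, `READ`, `WRITE`, `tiled_cholesky_2x2`) is hypothetical and chosen for illustration.

```python
from collections import defaultdict

# Hypothetical access modes for this toy model (not a real runtime's API).
READ, WRITE = "R", "W"


class ToyStfGraph:
    """Toy sequential-task-flow front end: infers task dependencies
    from the order of submission and the declared data access modes."""

    def __init__(self):
        self.tasks = []                      # task names, in submission order
        self.deps = defaultdict(set)         # task id -> ids of predecessor tasks
        self._last_writer = {}               # data handle -> id of last writer
        self._readers = defaultdict(list)    # data handle -> readers since last write

    def insert_task(self, name, *accesses):
        """Submit a task; `accesses` is a sequence of (handle, mode) pairs."""
        tid = len(self.tasks)
        self.tasks.append(name)
        self.deps[tid]  # make sure every task has an entry, even with no deps
        for handle, mode in accesses:
            # Read-after-write or write-after-write: depend on the last writer.
            if handle in self._last_writer:
                self.deps[tid].add(self._last_writer[handle])
            if mode == WRITE:
                # Write-after-read: wait for all readers since the last write.
                self.deps[tid].update(self._readers[handle])
                self._last_writer[handle] = tid
                self._readers[handle] = []
            else:
                self._readers[handle].append(tid)
        self.deps[tid].discard(tid)  # a task never depends on itself
        return tid


def tiled_cholesky_2x2():
    """Submit the four tasks of a 2x2-tile Cholesky in sequential order."""
    g = ToyStfGraph()
    g.insert_task("potrf(A00)", ("A00", WRITE))
    g.insert_task("trsm(A10)", ("A00", READ), ("A10", WRITE))
    g.insert_task("syrk(A11)", ("A10", READ), ("A11", WRITE))
    g.insert_task("potrf(A11)", ("A11", WRITE))
    return g


if __name__ == "__main__":
    graph = tiled_cholesky_2x2()
    for tid, name in enumerate(graph.tasks):
        preds = sorted(graph.tasks[p] for p in graph.deps[tid])
        print(f"{name} depends on {preds}")
```

In this toy model the user only lists tasks in order, and the chain potrf, trsm, syrk, potrf is recovered from the access modes alone. Real runtimes additionally handle distributed data placement and scheduling, which this sketch leaves out.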
Original language | English |
---|---|
Title of host publication | Proceedings of ROSS 2022 |
Subtitle of host publication | International Workshop on Runtime and Operating Systems for Supercomputers, Held in conjunction with SC 2022: The International Conference for High Performance Computing, Networking, Storage and Analysis |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 18-25 |
Number of pages | 8 |
ISBN (Electronic) | 9781665475662 |
DOIs | |
State | Published - 2022 |
Event | 12th IEEE/ACM International Workshop on Runtime and Operating Systems for Supercomputers, ROSS 2022 - Dallas, United States; Duration: Nov 13 2022 → Nov 18 2022 |
Publication series
Name | Proceedings of ROSS 2022: International Workshop on Runtime and Operating Systems for Supercomputers, Held in conjunction with SC 2022: The International Conference for High Performance Computing, Networking, Storage and Analysis |
---|
Conference
Conference | 12th IEEE/ACM International Workshop on Runtime and Operating Systems for Supercomputers, ROSS 2022 |
---|---|
Country/Territory | United States |
City | Dallas |
Period | 11/13/22 → 11/18/22 |
Funding
For computer time, this research used the resources of the Supercomputing Laboratory (KSL) Shaheen II at King Abdullah University of Science & Technology (KAUST) in Thuwal, Saudi Arabia, and the supercomputer Fugaku provided by RIKEN.
Keywords
- Dynamic Runtime Systems
- High Performance Computing
- Numerical Linear Algebra