TY - GEN
T1 - Scheduling parallel tasks under multiple resources
T2 - 32nd IEEE International Parallel and Distributed Processing Symposium, IPDPS 2018
AU - Sun, Hongyang
AU - Elghazi, Redouane
AU - Gainaru, Ana
AU - Aupy, Guillaume
AU - Raghavan, Padma
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2018/8/3
Y1 - 2018/8/3
AB - Scheduling in High-Performance Computing (HPC) has been traditionally centered around computing resources (e.g., processors/cores). The ever-growing amount of data produced by modern scientific applications is starting to drive novel architectures and new computing frameworks to support more efficient data processing, transfer and storage for future HPC systems. This trend towards data-driven computing demands that scheduling solutions also consider other resources (e.g., I/O, memory, cache) that can be shared amongst competing applications. In this paper, we study the problem of scheduling HPC applications while exploring the availability of multiple types of resources that could impact their performance. The goal is to minimize the overall execution time, or makespan, for a set of moldable tasks under multiple-resource constraints. Two scheduling paradigms, namely, list scheduling and pack scheduling, are compared through both theoretical analyses and experimental evaluations. Theoretically, we prove, for several algorithms falling in the two scheduling paradigms, tight approximation ratios that increase linearly with the number of resource types. As the complexity of direct solutions grows exponentially with the number of resource types, we also design a strategy to indirectly solve the problem via a transformation to a single-resource-type problem, which can significantly reduce the algorithms' running times without compromising their approximation ratios. Experiments conducted on Intel Knights Landing with two resource types (processor cores and high-bandwidth memory) and simulations designed on more resource types confirm the benefit of the transformation strategy and show that pack-based scheduling, despite having a worse theoretical bound, offers a practically promising and easy-to-implement solution, especially when more resource types need to be managed.
KW - HPC
KW - Multi resource
KW - Scheduling
UR - http://www.scopus.com/inward/record.url?scp=85052198345&partnerID=8YFLogxK
U2 - 10.1109/IPDPS.2018.00029
DO - 10.1109/IPDPS.2018.00029
M3 - Conference contribution
AN - SCOPUS:85052198345
SN - 9781538643686
T3 - Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium, IPDPS 2018
SP - 194
EP - 203
BT - Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium, IPDPS 2018
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 21 May 2018 through 25 May 2018
ER -