Abstract
Applications structured as parallel task graphs exhibit both data and task parallelism and arise in many domains. Scheduling these applications efficiently on parallel platforms has been a long-standing challenge. In the case of a single homogeneous platform, such as a cluster, results have been obtained both in theory (guaranteed algorithms) and in practice (pragmatic heuristics). Due to task parallelism, these applications are well suited for execution on distributed platforms that span multiple clusters, possibly in multiple institutions. However, the only available results in this context are nonguaranteed heuristics. In this paper, we develop a scheduling algorithm, MCGAS, which is applicable to multicluster platforms that are almost homogeneous. Such platforms are often found as large subsets of multicluster platforms. Our novel contribution is that MCGAS computes task allocations so that a (tunable) performance guarantee is provided. Since a performance guarantee does not necessarily imply good average performance in practice, we also compare MCGAS with a recently proposed nonguaranteed algorithm. Using simulation over a wide range of experimental scenarios, we find that MCGAS leads to better average application makespans than its competitor.
Original language | English |
---|---|
Pages (from-to) | 940-952 |
Number of pages | 13 |
Journal | IEEE Transactions on Parallel and Distributed Systems |
Volume | 20 |
Issue number | 7 |
State | Published - 2009 |
Externally published | Yes |
Funding
aggregating a total of 5,000 CPUs, and is funded by the French ACI Grid incentive of the French Ministry of Research and Education. Each of the nine sites hosts at least one commodity cluster, and the number of processors per cluster ranges from around 100 to around 1,000. The architectures of these processors are AMD Opteron, Intel Xeon, Intel Itanium 2, or PowerPC. The work of H. Casanova is supported in part by the US National Science Foundation under award number 0546688.
Funders | Funder number |
---|---|
French Ministry of Research and Education | |
US National Science Foundation | |
Directorate for Computer and Information Science and Engineering | 0546688 |
Australian Carbon Innovation | |
Keywords
- Mixed parallelism
- Multicluster platform
- Parallel task graph scheduling
- Performance guarantee