TY - JOUR
T1 - Controlling fairness and task granularity in distributed, online, non-clairvoyant workflow executions
AU - Ferreira da Silva, Rafael
AU - Glatard, Tristan
AU - Desprez, Frédéric
N1 - Publisher Copyright: Copyright © 2014 John Wiley & Sons, Ltd.
PY - 2014/9/25
Y1 - 2014/9/25
AB - Distributed computing infrastructures are commonly used for scientific computing, and science gateways provide complete middleware stacks to allow their transparent exploitation by end users. However, administering such systems manually is time-consuming and sub-optimal because of the complexity of the execution conditions. Algorithms and frameworks that aim to automate system administration must deal with online and non-clairvoyant conditions, where most parameters are unknown and evolve over time. We consider the problem of controlling task granularity and fairness among scientific workflows executed in these conditions. We present two self-managing loops that monitor the fineness, coarseness, and fairness of workflow executions, compare these metrics with thresholds extracted from knowledge acquired in previous executions, and plan appropriate actions to keep these metrics within appropriate ranges. Experiments on the European Grid Infrastructure show that our task granularity control can speed up executions by up to a factor of 2 and that our fairness control reduces slowdown variability by 3-7 compared with first-come, first-served. We also study the interaction between granularity control and fairness control: our experiments demonstrate that controlling task granularity degrades fairness but that our fairness control algorithm can compensate for this degradation.
KW - Distributed computing infrastructures
KW - Fairness
KW - Non-clairvoyant conditions
KW - Online conditions
KW - Scientific workflows
KW - Task granularity
UR - http://www.scopus.com/inward/record.url?scp=84923329002&partnerID=8YFLogxK
U2 - 10.1002/cpe.3303
DO - 10.1002/cpe.3303
M3 - Article
AN - SCOPUS:84923329002
SN - 1532-0626
VL - 26
SP - 2347
EP - 2366
JO - Concurrency and Computation: Practice and Experience
JF - Concurrency and Computation: Practice and Experience
IS - 14
ER -