Toward fine-grained online task characteristics estimation in scientific workflows

  • Rafael Ferreira Da Silva
  • Gideon Juve
  • Ewa Deelman
  • Tristan Glatard
  • Frédéric Desprez
  • Douglas Thain
  • Benjamín Tovar
  • Miron Livny

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

43 Scopus citations

Abstract

Estimates of task characteristics such as runtime, disk space, and memory consumption are commonly used by scheduling algorithms and resource provisioning techniques to provide successful and efficient workflow executions. These methods assume that accurate estimates are available, but in production systems it is hard to compute such estimates with good accuracy. In this work, we first profile three real scientific workflows, collecting fine-grained information such as process I/O, runtime, memory usage, and CPU utilization. We then propose a method to automatically characterize workflow task needs based on these profiles. Our method estimates task runtime, disk space, and memory consumption based on the size of the tasks' input data. It looks for correlations between the parameters of a dataset, and if no correlation is found, the dataset is divided into smaller subsets using a clustering technique. Task behavior is estimated from the ratio of parameter to input data size if the two are correlated, or from the mean value otherwise. However, task dependencies in scientific workflows lead to a chain of estimation errors. To correct such errors, we propose an online estimation process based on the MAPE-K loop, where task executions are constantly monitored and estimates are updated accordingly. Experimental results show that our online estimation process yields much more accurate predictions than an offline approach, where all task needs are estimated at once.
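The ratio-or-mean rule described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `pearson` and `estimate`, the use of Pearson correlation, and the 0.8 threshold are all assumptions made for illustration, and the clustering step applied when no correlation is found is omitted.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def estimate(input_sizes, values, new_input_size, threshold=0.8):
    """Estimate a task parameter (e.g. runtime) for a new input size.

    Assumptions (not from the paper): Pearson correlation decides whether
    the parameter is "correlated" with input size, the threshold is 0.8,
    and all observed input sizes are strictly positive.
    """
    if len(input_sizes) >= 2 and abs(pearson(input_sizes, values)) >= threshold:
        # Correlated: scale the new input size by the mean parameter/input-size ratio.
        ratio = mean([v / s for v, s in zip(values, input_sizes)])
        return ratio * new_input_size
    # Not correlated: fall back to the mean of the observed values.
    return mean(values)


# Hypothetical observations from completed tasks.
sizes = [10, 20, 30, 40]
runtimes = [5, 10, 15, 20]   # grows with input size
memory = [7, 3, 7, 3]        # no clear relation to input size

runtime_estimate = estimate(sizes, runtimes, 100)
memory_estimate = estimate(sizes, memory, 100)
```

An online variant in the spirit of the monitoring loop would append each completed task's measurements to the observation lists and call `estimate` again before the next task is scheduled, rather than computing all estimates once up front.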

Original language: English
Title of host publication: Proceedings of WORKS 2013
Subtitle of host publication: 8th Workshop on Workflows in Support of Large-Scale Science - Held in conjunction with SC 2013: The International Conference for High Performance Computing, Networking, Storage and Analysis
Publisher: Association for Computing Machinery, Inc
Pages: 58-67
Number of pages: 10
ISBN (Electronic): 9781450325028
DOIs
State: Published - Nov 17 2013
Externally published: Yes
Event: 8th Workshop on Workflows in Support of Large-Scale Science, WORKS 2013 - Denver, United States
Duration: Nov 17 2013 → …

Publication series

Name: Proceedings of WORKS 2013: 8th Workshop on Workflows in Support of Large-Scale Science - Held in conjunction with SC 2013: The International Conference for High Performance Computing, Networking, Storage and Analysis

Conference

Conference: 8th Workshop on Workflows in Support of Large-Scale Science, WORKS 2013
Country/Territory: United States
City: Denver
Period: 11/17/13 → …

Keywords

  • MAPE-K loop
  • Online task estimation
  • Scientific workflow
  • Workflow characterization
