TY - GEN
T1 - Performance analysis of an I/O-intensive workflow executing on Google cloud and Amazon web services
AU - Nawaz, Hassan
AU - Juve, Gideon
AU - Da Silva, Rafael Ferreira
AU - Deelman, Ewa
N1 - Publisher Copyright:
© 2016 IEEE.
PY - 2016/7/18
Y1 - 2016/7/18
N2 - Scientific workflows have become the mainstream to conduct large-scale scientific research. In the meantime, cloud computing has emerged as an alternative computing paradigm. In this paper, we conduct an analysis of the performance of an I/O-intensive real scientific workflow on cloud environments using makespan (the turnaround time for a workflow to complete its execution) as the key performance metric. In particular, we assess the impact of varying the storage configurations on workflow performance when executing on Google Cloud and Amazon Web Services. We aim to understand the performance bottlenecks of the popular cloud-based execution environments. Experimental results show significant differences in application performance for different configurations. They also reveal that Amazon Web Services outperforms Google Cloud with equivalent application and system configurations. We then investigate the root cause of these results using provenance data and by benchmarking disk and network I/O on both infrastructures. Lastly, we also suggest modifications in the standard cloud storage APIs, which will reduce the makespan for I/O-intensive workflows.
AB - Scientific workflows have become the mainstream to conduct large-scale scientific research. In the meantime, cloud computing has emerged as an alternative computing paradigm. In this paper, we conduct an analysis of the performance of an I/O-intensive real scientific workflow on cloud environments using makespan (the turnaround time for a workflow to complete its execution) as the key performance metric. In particular, we assess the impact of varying the storage configurations on workflow performance when executing on Google Cloud and Amazon Web Services. We aim to understand the performance bottlenecks of the popular cloud-based execution environments. Experimental results show significant differences in application performance for different configurations. They also reveal that Amazon Web Services outperforms Google Cloud with equivalent application and system configurations. We then investigate the root cause of these results using provenance data and by benchmarking disk and network I/O on both infrastructures. Lastly, we also suggest modifications in the standard cloud storage APIs, which will reduce the makespan for I/O-intensive workflows.
KW - Cloud Computing
KW - I/O Performance Modeling
KW - Scientific Workflow
UR - https://www.scopus.com/pages/publications/84991632828
U2 - 10.1109/IPDPSW.2016.90
DO - 10.1109/IPDPSW.2016.90
M3 - Conference contribution
AN - SCOPUS:84991632828
T3 - Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016
SP - 535
EP - 544
BT - Proceedings - 2016 IEEE 30th International Parallel and Distributed Processing Symposium, IPDPS 2016
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 30th IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2016
Y2 - 23 May 2016 through 27 May 2016
ER -