HPC Storage Service Autotuning Using Variational-Autoencoder-Guided Asynchronous Bayesian Optimization

Matthieu Dorier, Romain Egele, Prasanna Balaprakash, Jaehoon Koo, Sandeep Madireddy, Srinivasan Ramesh, Allen D. Malony, Rob Ross

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

8 Scopus citations

Abstract

Distributed data storage services tailored to specific applications have grown popular in the high-performance computing (HPC) community as a way to address I/O and storage challenges. These services offer a variety of specific interfaces, semantics, and data representations. They also expose many tuning parameters, making it difficult for users to find the best configuration for a given workload and platform. To address this issue, we develop a novel variational-autoencoder-guided asynchronous Bayesian optimization method to tune HPC storage service parameters. Our approach uses transfer learning to leverage prior tuning results and uses a dynamically updated surrogate model to explore the large parameter search space systematically. We implement our approach within the DeepHyper open-source framework and apply it to the autotuning of a high-energy physics workflow on Argonne's Theta supercomputer. We show that our transfer-learning approach enables a more than 40× search speedup over random search, compared with a 2.5× to 10× speedup when not using transfer learning. Additionally, we show that our approach is on par with state-of-the-art autotuning frameworks in speed and outperforms them in resource utilization and parallelization capabilities.
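
The idea of reusing prior tuning results to bias where new configurations are sampled can be illustrated with a small, self-contained sketch. This is not the method of the paper or the DeepHyper API: the variational autoencoder is replaced here by a much simpler independent Gaussian per parameter, the surrogate model and asynchronous evaluation are omitted, and the objective `toy_runtime`, the parameter ranges in `BOUNDS`, and all function names are hypothetical.

```python
import random
import statistics

# Hypothetical search space: (batch size, thread count) ranges, inclusive.
BOUNDS = [(1, 128), (1, 32)]


def toy_runtime(config):
    """Hypothetical objective standing in for a measured workflow runtime.

    Lower is better; the minimum value is 5.0 at (48, 12).
    """
    batch, threads = config
    return (batch - 48) ** 2 / 100.0 + (threads - 12) ** 2 / 10.0 + 5.0


def sample_uniform(rng):
    """Draw one configuration uniformly at random (plain random search)."""
    return tuple(rng.randint(lo, hi) for lo, hi in BOUNDS)


def fit_prior(history, top_fraction=0.1):
    """Fit one Gaussian per parameter to the best configurations in history.

    history is a list of (config, runtime) pairs from an earlier search.
    This simple model is a stand-in for the VAE; it ignores correlations
    between parameters.
    """
    ranked = sorted(history, key=lambda item: item[1])
    k = max(2, int(len(ranked) * top_fraction))  # stdev needs >= 2 points
    top = [cfg for cfg, _ in ranked[:k]]
    # Fall back to a spread of 1.0 if all top values are identical.
    return [(statistics.mean(col), statistics.stdev(col) or 1.0)
            for col in zip(*top)]


def sample_prior(rng, prior):
    """Draw one configuration from the fitted prior, clipped to BOUNDS."""
    cfg = []
    for (mu, sigma), (lo, hi) in zip(prior, BOUNDS):
        value = round(rng.gauss(mu, sigma))
        cfg.append(min(hi, max(lo, value)))
    return tuple(cfg)


def search(sampler, budget):
    """Evaluate `budget` sampled configurations and return the best runtime."""
    return min(toy_runtime(sampler()) for _ in range(budget))


if __name__ == "__main__":
    rng = random.Random(0)
    # Pretend these 200 evaluations came from an earlier tuning campaign.
    history = [(cfg, toy_runtime(cfg))
               for cfg in (sample_uniform(rng) for _ in range(200))]
    prior = fit_prior(history)
    print("random search, best runtime:",
          search(lambda: sample_uniform(rng), 20))
    print("prior-guided,  best runtime:",
          search(lambda: sample_prior(rng, prior), 20))
```

In this toy setting the prior concentrates samples near configurations that performed well before, which is the intuition behind transfer learning in autotuning. A generative model such as a VAE can additionally capture dependencies between parameters, which the per-parameter Gaussian above cannot.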

Original language: English
Title of host publication: Proceedings - 2022 IEEE International Conference on Cluster Computing, CLUSTER 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 381-393
Number of pages: 13
ISBN (Electronic): 9781665498562
State: Published - 2022
Externally published: Yes
Event: 2022 IEEE International Conference on Cluster Computing, CLUSTER 2022 - Heidelberg, Germany
Duration: Sep 6 2022 to Sep 9 2022

Publication series

Name: Proceedings - IEEE International Conference on Cluster Computing, ICCC
Volume: 2022-September
ISSN (Print): 1552-5244

Conference

Conference: 2022 IEEE International Conference on Cluster Computing, CLUSTER 2022
Country/Territory: Germany
City: Heidelberg
Period: 09/6/22 to 09/9/22

Funding

We thank the authors of GPtune (in particular Yang Liu and Younghyun Cho) and HiPerBOt (in particular Harshitha Menon) for their time and valuable explanation of their respective software, and for making them available for comparison. We also thank Gail Pieper for proofreading and editing this paper. This material is based upon work supported by the U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research, under Contract DE-AC02-06CH11357.

Keywords

  • Autotuning
  • Bayesian Optimization
  • DeepHyper
  • HPC
  • I/O
  • Mochi
  • Storage
  • Transfer Learning
