On the use of containers in high performance computing environments

Subil Abraham, Arnab K. Paul, Redwan Ibne Seraj Khan, Ali R. Butt

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

24 Scopus citations

Abstract

The lightweight nature, application portability, and deployment flexibility of containers are driving their widespread adoption in cloud solutions. Data analysis and deep learning (DL)/machine learning (ML) applications have especially benefited from containerization. As such data analysis is adopted in high performance computing (HPC), the need for container support in HPC has become paramount. However, containers face crucial performance and I/O challenges in HPC. One obstacle is that, although HPC container solutions exist, they have not been thoroughly investigated, especially with respect to their impact on the crucial HPC I/O throughput. To this end, this paper provides a first-of-its-kind empirical analysis of state-of-the-art representative container solutions (Docker, Podman, Singularity, and Charliecloud) in HPC environments. We also explore how containers interact with an HPC parallel file system like Lustre. We present the design of an analysis framework that is deployed on all nodes in an HPC environment and captures CPU, memory, network, and file I/O statistics from the nodes and the storage system. Our analysis yields key insights, e.g., Charliecloud outperforms other container solutions in terms of container start-up time, while Singularity and Charliecloud are equivalent in I/O throughput. But this comes at a cost, as Charliecloud invokes the most metadata and I/O operations on the underlying Lustre file system. By identifying such trade-offs and optimization opportunities, we can enhance the performance of HPC containers and of the ML/DL applications that increasingly rely on them.

Original language: English
Title of host publication: Proceedings - 2020 IEEE 13th International Conference on Cloud Computing, CLOUD 2020
Publisher: IEEE Computer Society
Pages: 284-293
Number of pages: 10
ISBN (Electronic): 9781728187808
State: Published - Oct 2020
Externally published: Yes
Event: 13th IEEE International Conference on Cloud Computing, CLOUD 2020 - Virtual, Beijing, China
Duration: Oct 18 2020 to Oct 24 2020

Publication series

Name: IEEE International Conference on Cloud Computing, CLOUD
Volume: 2020-October
ISSN (Print): 2159-6182
ISSN (Electronic): 2159-6190

Conference

Conference: 13th IEEE International Conference on Cloud Computing, CLOUD 2020
Country/Territory: China
City: Virtual, Beijing
Period: 10/18/20 to 10/24/20

Funding

This work is sponsored in part by the National Science Foundation under grants CCF-1919113, CNS-1405697, CNS-1615411, CNS-1565314/1838271.

Keywords

  • Container Performance
  • HPC Storage and I/O
  • High Performance Computing
  • Parallel File Systems
