Abstract
We present our experience using containers to scale up a massive ensemble of coupled, I/O-bound workloads on the NERSC Cori supercomputer. We describe the design of a hierarchical simulation structure using the Integrated Plasma Simulator (IPS) that enables the flexible execution of coupled simulations at the system, node, and core levels using the same coupling abstraction and API. The hierarchical design allows node-level workloads to run efficiently in containers without affecting the structure of the simulation at the system level. We demonstrate the viability of the approach with experimental results from coupled fusion plasma simulations that illustrate the performance impact of using containers to deploy the node-level workloads, in conjunction with user-mountable XFS file systems to reduce the load on the Lustre parallel file system. We also present results from production runs showing that the ensemble simulations scale to hundreds of Cori Haswell nodes with little or no overhead.
Original language | English |
---|---|
Title of host publication | Proceedings of CANOPIE-HPC 2020 |
Subtitle of host publication | 2nd International Workshop on Containers and New Orchestration Paradigms for Isolated Environments in HPC, Held in conjunction with SC 2020: The International Conference for High Performance Computing, Networking, Storage and Analysis |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 26-34 |
Number of pages | 9 |
ISBN (Electronic) | 9781665415552 |
State | Published - Nov 2020 |
Event | 2nd International Workshop on Containers and New Orchestration Paradigms for Isolated Environments in HPC, CANOPIE-HPC 2020 - Virtual, Atlanta, United States. Duration: Nov 12 2020 → … |
Publication series
Name | Proceedings of CANOPIE-HPC 2020: 2nd International Workshop on Containers and New Orchestration Paradigms for Isolated Environments in HPC, Held in conjunction with SC 2020: The International Conference for High Performance Computing, Networking, Storage and Analysis |
---|
Conference
Conference | 2nd International Workshop on Containers and New Orchestration Paradigms for Isolated Environments in HPC, CANOPIE-HPC 2020 |
---|---|
Country/Territory | United States |
City | Virtual, Atlanta |
Period | 11/12/20 → … |
Funding
This work has been supported by the U.S. Department of Energy, Offices of Fusion Energy Sciences and Advanced Scientific Computing Research. This work used resources of the National Energy Research Scientific Computing Center (NERSC), a U.S. Department of Energy Office of Science User Facility operated under Contract No. DE-AC02-05CH11231. Oak Ridge National Laboratory (ORNL) is managed by UT-Battelle, LLC for the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.
Keywords
- Containers
- Ensemble
- Fusion
- Python
- Virtual File System
- Workflow