Abstract
Scientific productivity can be enhanced through workflow management tools, which relieve users of large High Performance Computing (HPC) systems of the tedious tasks of scheduling and designing the complex computational execution of scientific applications. This paper presents a study on the usage of ensemble workflow tools to accelerate science using the Summit and Frontier supercomputing systems. The research aims to connect science domain simulations using Oak Ridge Leadership Computing Facility (OLCF) supercomputing platforms with ensemble workflow methods in order to accelerate HPC-enabled discovery and boost scientific impact. We present the coupling, porting and optimization of Radical-Cybertools on three applications: Chroma, NAMD and LAMMPS. The tools augment traditional HPC monolithic runs with a pilot scheduler. Lessons learned are discussed for physics, biology and materials science applications. We discuss intrinsic limitations of coupling and porting ensemble workflow tools to applications that run on large HPC systems. The origins of technical challenges and the solutions developed during the implementation process are discussed. Data management strategies, OLCF's policies for ensembles, and natively supported workflow tools are also summarized.
| Original language | English |
|---|---|
| Title of host publication | Proceedings of SC 2024-W |
| Subtitle of host publication | Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 394-401 |
| Number of pages | 8 |
| ISBN (Electronic) | 9798350355543 |
| DOIs | |
| State | Published - 2024 |
| Event | 2024 Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC Workshops 2024 - Atlanta, United States. Duration: Nov 17 2024 → Nov 22 2024 |
Publication series
| Name | Proceedings of SC 2024-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis |
|---|---|
Conference
| Conference | 2024 Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC Workshops 2024 |
|---|---|
| Country/Territory | United States |
| City | Atlanta |
| Period | 11/17/24 → 11/22/24 |
Funding
The authors thank the Radical-Cybertools team and Shantenu Jha, Matteo Turilli and Mikhail Titov for their insightful support and contributions. The authors would also like to thank Professor Peter Coveney, Balint Joo and Nick Hagerty for their contributions to this work. This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.