Abstract
High-performance computing (HPC) users interact with Summit through dedicated gateways, also known as login nodes. The performance and stability of these login nodes can have a significant impact on the user experience. In this study, the performance and stability of Summit’s five login nodes are evaluated by analyzing log data from 2020 and 2021. The analysis focuses on computing capability (average CPU load, users, and tasks) and storage performance, along with the associated job scheduler activity. The outcome of this study can serve as the foundation of a predictive modeling framework that enables the system administrators of an HPC system to deploy countermeasures preemptively, before the onset of a system failure.
Original language | English |
---|---|
Title of host publication | Accelerating Science and Engineering Discoveries Through Integrated Research Infrastructure for Experiment, Big Data, Modeling and Simulation - 22nd Smoky Mountains Computational Sciences and Engineering Conference, SMC 2022, Revised Selected Papers |
Editors | Doug Kothe, Al Geist, Swaroop Pophale, Hong Liu, Suzanne Parete-Koon |
Publisher | Springer Science and Business Media Deutschland GmbH |
Pages | 329-344 |
Number of pages | 16 |
ISBN (Print) | 9783031236051 |
State | Published - 2022 |
Event | Smoky Mountains Computational Sciences and Engineering Conference, SMC 2022 - Virtual, Online. Duration: Aug 24 2022 → Aug 25 2022 |
Publication series
Name | Communications in Computer and Information Science |
---|---|
Volume | 1690 CCIS |
ISSN (Print) | 1865-0929 |
ISSN (Electronic) | 1865-0937 |
Conference
Conference | Smoky Mountains Computational Sciences and Engineering Conference, SMC 2022 |
---|---|
City | Virtual, Online |
Period | 08/24/22 → 08/25/22 |
Funding
Acknowledgements. This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.
6th Annual Smoky Mountains Computational Sciences Data Challenge (SMCDC22)
This manuscript has been authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).
Keywords
- HPC
- Performance analysis
- Summit