An HPC-Container Based Continuous Integration Tool for Detecting Scaling and Performance Issues in HPC Applications

Jake Tronge, Jieyang Chen, Patricia Grubel, Tim Randles, Rusty Davis, Quincy Wofford, Steven Anaya, Qiang Guan

Research output: Contribution to journal › Article › peer-review

Abstract

Testing is one of the most important steps in software development: it ensures the quality of software. Continuous Integration (CI) is a widely used testing standard that reports software quality to the developer in a timely manner as development progresses. Performance, especially scalability, is another key factor for High Performance Computing (HPC) applications. There are many existing profiling and performance tools for HPC applications, but none of them is integrated into CI tools. In this work, we propose BeeSwarm, an HPC-container-based parallel scaling performance system that can be easily applied to current CI test environments. BeeSwarm is designed mainly for HPC application developers who need to monitor how their applications scale on different compute resources. We demonstrate BeeSwarm using three different HPC applications: CoMD, LULESH, and NWChem. We utilize GitHub Actions and provision resources from Google Compute Engine. Our results show that BeeSwarm can be used for scalability and performance testing of a variety of HPC applications, allowing developers to monitor application performance over time.

Original language: English
Pages (from-to): 156-168
Number of pages: 13
Journal: IEEE Transactions on Services Computing
Volume: 17
Issue number: 1
State: Published - Jan 1 2024
Externally published: Yes

Keywords

  • scalability test
  • cloud computing
  • container
  • continuous integration
  • high performance computing
