Abstract
We provide an overview of the software engineering efforts in QMCPACK and their impact. QMCPACK is a production-level, open-source ab initio Quantum Monte Carlo code targeting high-performance computing (HPC) systems. The aspects covered are: (i) strategic expansion of continuous integration (CI) targeting CPUs, using GitHub Actions' own runners, and the NVIDIA and AMD GPUs used in pre-exascale systems; (ii) incremental reduction of memory leaks using sanitizers; (iii) incorporation of Docker containers for CI and reproducibility; and (iv) refactoring efforts to improve maintainability, test coverage, and memory lifetime management. We quantify the value of these improvements with metrics that illustrate the shift towards a predictive, rather than reactive, maintenance approach. In documenting the impact of these efforts on QMCPACK, our goal is to contribute to the body of knowledge on the importance of research software engineering (RSE) for the stewardship and advancement of community HPC codes to enable scientific discovery at scale.
Original language | English |
---|---|
Article number | 107502 |
Journal | Future Generation Computer Systems |
Volume | 163 |
State | Published - Feb 2025 |
Keywords
- CI
- High-performance computing
- Memory safety
- QMCPACK
- Software engineering