Abstract
The Exascale Computing Project software deployment effort developed and advanced DevOps capabilities. One goal was to enable robust continuous integration (CI) workflows that span the protected high-performance computing environments found within many of the U.S. Department of Energy's (DOE's) national laboratories. This article highlights several challenges encountered with enabling automation, such as charging models for CI jobs and meeting individualized security requirements that revolve around strongly associating running code with a human identity. It also describes how the Jacamar CI tool evolved to meet later requirements and became a key aspect of the solutions currently offered. Drawing on this experience, we offer a conceptual framework for understanding current and future CI challenges at DOE facilities and suggest long-term solutions.
Original language | English |
---|---|
Pages (from-to) | 31-39 |
Number of pages | 9 |
Journal | Computing in Science and Engineering |
Volume | 26 |
Issue number | 1 |
DOIs | |
State | Published - 2024 |
Funding
This research was supported by the ECP (17-SC-20-SC), a joint project of the U.S. DOE's Office of Science and National Nuclear Security Administration, responsible for delivering a capable exascale ecosystem, including software, applications, and hardware technology, to support the nation's exascale computing imperative. The ECP CI effort consisted of several years of work by many individuals spread out across the DOE as well as academic and industry partners. The authors would like to thank all those who participated and continue to participate in bringing this important capability to the DOE's most advanced machines.
Funders | Funder number |
---|---|
U.S. Department of Energy | |
Office of Science and National Nuclear Security Administration | |