ECP libraries and tools: An overview

Michael A. Heroux, Lois Curfman McInnes, James Ahrens, Todd Gamblin, Timothy C. Germann, Xiaoye Sherry Li, Kathryn Mohror, Todd Munson, Sameer Shende, Rajeev Thakur, Jeffrey Vetter, James Willenbring

Research output: Contribution to journal › Article › peer-review

Abstract

The Exascale Computing Project (ECP) Software Technology and Co-Design teams addressed the growing complexities in high-performance computing (HPC) by developing scalable software libraries and tools that leverage exascale system capabilities. As we enter the exascale era, the need for reusable, optimized software solutions that can handle the unique challenges posed by these systems becomes increasingly important. The primary challenges the ECP teams faced were to create software libraries and tools that are performant on exascale architectures as well as portable and usable across diverse hardware platforms. Efforts addressed issues related to concurrent execution, memory management, and the integration of heterogeneous computing resources, such as GPUs from multiple vendors. The ECP’s strategy involved a structured development process encompassing the creation, optimization, and deployment of software in collaboration with industry, academia, and national laboratories. The project was organized into several technical areas: co-design of domain-specific suites with target applications, programming models and runtimes, development tools, mathematical libraries, data and visualization tools, and software ecosystem and delivery mechanisms. ECP has successfully developed a large portfolio of software libraries and tools that demonstrate significant improvements in performance and scalability on exascale systems. These products have been integrated into the Department of Energy’s computing facilities, supporting various scientific applications and ensuring robust performance across different hardware setups. ECP advancements in software development for exascale computing highlight the importance of a collaborative and adaptive approach to handling the complexities of next-generation HPC systems. The lessons learned emphasize the need for continuous engagement with end-users and vendors, and the importance of maintaining a balance between innovation and practical implementation.
Future efforts will focus on ensuring scalability, keeping pace with rapid hardware advancements, and further enhancing the interoperability and usability of the software ecosystem. Subsequent articles in this special issue provide in-depth discussions and case studies of specific library and tool efforts.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration.

Keywords

  • Exascale computing project
  • libraries
  • software
  • software ecosystem
  • tools
