Providing Thermal Stability for an Exascale Supercomputer: A Case Study of Frontier's Cooling System

  • David Grant
  • Luca Bortot
  • Chris Deprater
  • Dave Martinez
  • Ryan E. Grant
  • Natalie Bates

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    Abstract

    High-performance computing (HPC) systems frequently produce large dynamic power swings, even under typical operating conditions, that can present a significant challenge for their direct liquid cooling systems. Further, the primary cooling loops that must remove this waste heat have response times measured in minutes, while the underlying HPC component thermal stress is measured in seconds. The per-socket power demand for both central processing units (CPUs) and graphics processing units (GPUs) continues to increase with each successive generation while case temperatures are declining. New HPC systems are expected to exacerbate the challenge of these dynamic power swings and their impact on effective and timely cooling. This paper describes the cooling and controls system for Oak Ridge National Laboratory's Frontier supercomputer, the first sustained exascale system, as a case study for this situation. The cooling and control system for Frontier demonstrates specific success, but with a number of trade-offs and decisions that suggest further design and operating optimizations for the community at large to consider.

    Original language: English
    Title of host publication: Proceedings of Supercomputing Asia and International Conference on High Performance Computing in Asia Pacific Region Workshops, SCA/HPCAsia 2026 Workshops
    Publisher: Association for Computing Machinery, Inc
    Pages: 69-78
    Number of pages: 10
    ISBN (Electronic): 9798400723285
    DOIs
    State: Published - Jan 25 2026
    Event: Supercomputing Asia and International Conference on High Performance Computing in Asia Pacific Region Workshops, SCA/HPCAsia 2026 Workshops - Osaka, Japan
    Duration: Jan 26 2026 to Jan 29 2026

    Publication series

    Name: Proceedings of Supercomputing Asia and International Conference on High Performance Computing in Asia Pacific Region Workshops, SCA/HPCAsia 2026 Workshops

    Conference

    Conference: Supercomputing Asia and International Conference on High Performance Computing in Asia Pacific Region Workshops, SCA/HPCAsia 2026 Workshops
    Country/Territory: Japan
    City: Osaka
    Period: 01/26/26 to 01/29/26

    Keywords

    • Exascale
    • HPC
    • Supercomputing
    • cooling
    • liquid cooling
