Advanced application support for improved GPU utilization on Keeneland

Richard Glassbrook, Jeffrey S. Vetter, Jeffrey Young, M. Graham Lopez, Mitch Horton

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

With the delivery of the Keeneland Full Scale (KFS) system in 2012, XSEDE gained a new, unique GPU computing resource that contains a large number of GPUs per node. In KFS, each node has three NVIDIA Fermi GPUs, for a total of 792 GPUs and a theoretical peak of 614.5 TFLOPS across 264 nodes. While this system provides the potential for extreme productivity, its unique architecture also requires that each user make full use of all the GPU resources on each allocated node to achieve the best performance. Previous publications [12] have demonstrated a tool that tracks the GPU utilization of individual nodes and of the system as a whole, and it has helped to pinpoint low GPU utilization on KFS and its precursor, KIDS. This work discusses experiences, strategies, and results that have been applied on the Keeneland Full Scale system to ensure that users are fully utilizing GPU resources and to improve the performance of their calculations while reducing Service Unit (SU) usage. In many cases, these strategies boil down to two factors: user education and code optimization for KFS's unique architecture. Three specific applications are discussed in this context from the molecular science, materials science, and chemistry domains, and recent application support results are used to illustrate how small interventions can greatly increase utilization on a month-to-month basis.

Original language: English
Title of host publication: Proceedings of the XSEDE 2014 Conference
Subtitle of host publication: Engaging Communities
Publisher: Association for Computing Machinery
ISBN (Print): 9781450328937
State: Published - 2014
Event: 2014 Annual Conference on Extreme Science and Engineering Discovery Environment, XSEDE 2014 - Atlanta, GA, United States
Duration: Jul 13 2014 → Jul 18 2014

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 2014 Annual Conference on Extreme Science and Engineering Discovery Environment, XSEDE 2014
Country/Territory: United States
City: Atlanta, GA
Period: 07/13/14 → 07/18/14

Keywords

  • Application support
  • GPU
  • Scalable cluster computing
  • Utilization
