TY - GEN
T1 - Advanced application support for improved GPU utilization on Keeneland
AU - Glassbrook, Richard
AU - Vetter, Jeffrey S.
AU - Young, Jeffrey
AU - Lopez, M. Graham
AU - Horton, Mitch
PY - 2014
Y1 - 2014
N2 - With the delivery of the Keeneland Full Scale (KFS) system in 2012, XSEDE gained a new, unique GPU computing resource that contains a large number of GPUs per node. In KFS, each node has three NVIDIA Fermi GPUs, for a total of 792 GPUs and a theoretical peak of 614.5 TFLOPS across 264 nodes. While this system provides the potential for extreme productivity, its unique architecture also requires that each user make full use of all the GPU resources on each allocated node to achieve the best performance. Previous publications [12] have demonstrated a tool that allows for tracking the GPU utilization of individual nodes and the system as a whole, and it has helped to pinpoint low GPU utilization numbers on KFS and its precursor KIDS. This work discusses experiences, strategies, and results that have been applied on the Keeneland Full Scale system to ensure that users are fully utilizing GPU resources and to improve the performance of their calculations while reducing Service Unit (SU) usage. In many cases, these strategies boil down to two factors: user education and code optimization for KFS's unique architecture. Three specific applications are discussed in this context from the molecular science, materials science, and chemistry domains, and recent application support results are used to illustrate how small interventions can greatly increase utilization on a month-to-month basis.
AB - With the delivery of the Keeneland Full Scale (KFS) system in 2012, XSEDE gained a new, unique GPU computing resource that contains a large number of GPUs per node. In KFS, each node has three NVIDIA Fermi GPUs, for a total of 792 GPUs and a theoretical peak of 614.5 TFLOPS across 264 nodes. While this system provides the potential for extreme productivity, its unique architecture also requires that each user make full use of all the GPU resources on each allocated node to achieve the best performance. Previous publications [12] have demonstrated a tool that allows for tracking the GPU utilization of individual nodes and the system as a whole, and it has helped to pinpoint low GPU utilization numbers on KFS and its precursor KIDS. This work discusses experiences, strategies, and results that have been applied on the Keeneland Full Scale system to ensure that users are fully utilizing GPU resources and to improve the performance of their calculations while reducing Service Unit (SU) usage. In many cases, these strategies boil down to two factors: user education and code optimization for KFS's unique architecture. Three specific applications are discussed in this context from the molecular science, materials science, and chemistry domains, and recent application support results are used to illustrate how small interventions can greatly increase utilization on a month-to-month basis.
KW - Application support
KW - GPU
KW - Scalable cluster computing
KW - Utilization
UR - http://www.scopus.com/inward/record.url?scp=84905506231&partnerID=8YFLogxK
U2 - 10.1145/2616498.2616506
DO - 10.1145/2616498.2616506
M3 - Conference contribution
AN - SCOPUS:84905506231
SN - 9781450328937
T3 - ACM International Conference Proceeding Series
BT - Proceedings of the XSEDE 2014 Conference
PB - Association for Computing Machinery
T2 - 2014 Annual Conference on Extreme Science and Engineering Discovery Environment, XSEDE 2014
Y2 - 13 July 2014 through 18 July 2014
ER -