TY - GEN
T1 - The fat-link computation on large GPU clusters for lattice QCD
AU - Shi, Guochun
AU - Babich, Ronald
AU - Clark, Michael A.
AU - Joó, Bálint
AU - Gottlieb, Steven
AU - Kindratenko, Volodymyr
PY - 2012
Y1 - 2012
N2 - Graphics processing units (GPUs) are becoming increasingly popular in high-performance computing due to their high performance, high power efficiency and low cost. In this paper, we present results of an effort to implement the fat-link computation - an important component of many lattice quantum chromodynamics (LQCD) calculations - on GPU clusters using the QUDA framework. Two implementations, one similar to the original CPU algorithm in the MILC code and one based on the idea of reducing communication through redundant computation, are presented and their relative advantages are discussed. In strong-scaling tests on up to 384 GPUs on the Longhorn cluster and up to 256 GPUs on the Keeneland cluster, both of which have a CPU-core-to-GPU ratio of 4:1, we achieved node speedups of up to 11.4x and 8.7x, respectively.
AB - Graphics processing units (GPUs) are becoming increasingly popular in high-performance computing due to their high performance, high power efficiency and low cost. In this paper, we present results of an effort to implement the fat-link computation - an important component of many lattice quantum chromodynamics (LQCD) calculations - on GPU clusters using the QUDA framework. Two implementations, one similar to the original CPU algorithm in the MILC code and one based on the idea of reducing communication through redundant computation, are presented and their relative advantages are discussed. In strong-scaling tests on up to 384 GPUs on the Longhorn cluster and up to 256 GPUs on the Keeneland cluster, both of which have a CPU-core-to-GPU ratio of 4:1, we achieved node speedups of up to 11.4x and 8.7x, respectively.
KW - CUDA
KW - GPU
KW - Lattice QCD
KW - MILC
KW - QUDA
KW - Quantum chromodynamics
UR - http://www.scopus.com/inward/record.url?scp=84870692776&partnerID=8YFLogxK
U2 - 10.1109/SAAHPC.2012.10
DO - 10.1109/SAAHPC.2012.10
M3 - Conference contribution
AN - SCOPUS:84870692776
SN - 9780769548388
T3 - Symposium on Application Accelerators in High-Performance Computing
SP - 1
EP - 10
BT - Proceedings - 2012 Symposium on Application Accelerators in High Performance Computing, SAAHPC 2012
T2 - 2012 Symposium on Application Accelerators in High Performance Computing, SAAHPC 2012
Y2 - 10 July 2012 through 11 July 2012
ER -