TY - GEN
T1 - Accelerating S3D
T2 - Workshop on Highly Parallel Processing, Euro-Par 2009
AU - Spafford, Kyle
AU - Meredith, Jeremy
AU - Vetter, Jeffrey
AU - Chen, Jacqueline
AU - Grout, Ray
AU - Sankaran, Ramanan
PY - 2010
Y1 - 2010
AB - The graphics processor (GPU) has evolved into an appealing choice for high performance computing due to its superior memory bandwidth, raw processing power, and flexible programmability. As such, GPUs represent an excellent platform for accelerating scientific applications. This paper explores a methodology for identifying applications which present significant potential for acceleration. In particular, this work focuses on experiences from accelerating S3D, a high-fidelity turbulent reacting flow solver. The acceleration process is examined from a holistic viewpoint, and includes details that arise from different phases of the conversion. This paper also addresses the issue of floating point accuracy and precision on the GPU, a topic of immense importance to scientific computing. Several performance experiments are conducted, and results are presented from the NVIDIA Tesla C1060 GPU. We generalize from our experiences to provide a roadmap for deploying existing scientific applications on heterogeneous GPU platforms.
UR - http://www.scopus.com/inward/record.url?scp=77954617699&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-14122-5_16
DO - 10.1007/978-3-642-14122-5_16
M3 - Conference contribution
AN - SCOPUS:77954617699
SN - 3642141218
SN - 9783642141218
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 122
EP - 131
BT - Euro-Par 2009 Parallel Processing Workshops - HPPC, HeteroPar, PROPER, ROIA, UNICORE, VHPC, Workshops
Y2 - 25 August 2009 through 28 August 2009
ER -