Accelerating Lattice QCD Multigrid on GPUs Using Fine-Grained Parallelization

M. A. Clark, Balint Joo, Alexei Strelchenko, Michael Cheng, Arjun Gambhir, Richard C. Brower

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

26 Scopus citations

Abstract

The past decade has witnessed a dramatic acceleration of lattice quantum chromodynamics calculations in nuclear and particle physics. This has been due both to significant progress in accelerating the iterative linear solvers using multigrid algorithms and to the throughput improvements brought by GPUs. Deploying hierarchical algorithms optimally on GPUs is non-trivial owing to the lack of parallelism on the coarse grids, and as such, these advances have not proved multiplicative. Using the QUDA library, we demonstrate that by exposing all sources of parallelism that the underlying stencil problem possesses, and through appropriate mapping of this parallelism to the GPU architecture, we can achieve high efficiency even for the coarsest of grids. Results are presented for the Wilson-Clover discretization, where we demonstrate up to 10x speedup over present state-of-the-art GPU-accelerated methods on Titan. Finally, we look to the future, and consider the software implications of our findings.
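
The abstract's point about coarse grids lacking parallelism can be illustrated with a toy sketch. It is not QUDA code and not the authors' implementation. It assumes a periodic 1D lattice with a three-point stencil and a dense n x n block per stencil point, and every name in it (`make_problem`, `apply_site_parallel`, `apply_fine_grained`) is hypothetical. It compares one work item per lattice site with one work item per (site, stencil point, output row), which is one plausible reading of "exposing all sources of parallelism".

```python
import random


def make_problem(num_sites, n, seed=0):
    """Build a toy periodic 1D stencil with 3 dense n x n blocks per site."""
    rng = random.Random(seed)
    # Stencil offsets: 0 = on-site, +1 = forward neighbour, -1 = backward.
    offsets = (0, 1, -1)
    mats = [[[[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
             for _ in offsets] for _ in range(num_sites)]
    x = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(num_sites)]
    return offsets, mats, x


def apply_site_parallel(offsets, mats, x):
    """Coarse decomposition: one independent work item per lattice site."""
    num_sites, n = len(x), len(x[0])
    y = [[0.0] * n for _ in range(num_sites)]
    for s in range(num_sites):  # only num_sites independent items
        for d, off in enumerate(offsets):
            nbr = x[(s + off) % num_sites]
            for r in range(n):
                y[s][r] += sum(mats[s][d][r][c] * nbr[c] for c in range(n))
    return y, num_sites


def apply_fine_grained(offsets, mats, x):
    """Fine decomposition: one item per (site, stencil point, output row)."""
    num_sites, n = len(x), len(x[0])
    num_dirs = len(offsets)
    items = [(s, d, r) for s in range(num_sites)
             for d in range(num_dirs) for r in range(n)]
    # Each partial dot product depends only on its own inputs, so on a GPU
    # these could be computed by separate threads (an assumption of this toy).
    partial = {}
    for s, d, r in items:
        nbr = x[(s + offsets[d]) % num_sites]
        partial[(s, d, r)] = sum(mats[s][d][r][c] * nbr[c] for c in range(n))
    # Reduction over stencil points recombines the partial results.
    y = [[sum(partial[(s, d, r)] for d in range(num_dirs)) for r in range(n)]
         for s in range(num_sites)]
    return y, len(items)


if __name__ == "__main__":
    offsets, mats, x = make_problem(num_sites=4, n=6)
    y_coarse, items_coarse = apply_site_parallel(offsets, mats, x)
    y_fine, items_fine = apply_fine_grained(offsets, mats, x)
    print("independent work items:", items_coarse, "vs", items_fine)
```

With only four sites, the site-level decomposition offers four independent work items, while the finer one offers 4 x 3 x 6 = 72 and produces the same vector. The inner dot product could be split further in the same way, at the cost of a larger reduction.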

Original language: English
Title of host publication: Proceedings of SC 2016
Subtitle of host publication: The International Conference for High Performance Computing, Networking, Storage and Analysis
Publisher: IEEE Computer Society
Pages: 795-806
Number of pages: 12
ISBN (Electronic): 9781467388153
State: Published - Jul 2 2016
Externally published: Yes
Event: 2016 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2016 - Salt Lake City, United States
Duration: Nov 13 2016 to Nov 18 2016

Publication series

Name: International Conference for High Performance Computing, Networking, Storage and Analysis, SC
Volume: 0
ISSN (Print): 2167-4329
ISSN (Electronic): 2167-4337

Conference

Conference: 2016 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2016
Country/Territory: United States
City: Salt Lake City
Period: 11/13/16 to 11/18/16
