Overlapping computation and communication for advection on hybrid parallel computers

J. B. White, J. J. Dongarra

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

15 Scopus citations

Abstract

We describe computational experiments exploring the performance improvements from overlapping computation and communication on hybrid parallel computers. Our test case is explicit time integration of linear advection with constant uniform velocity in a three-dimensional periodic domain. The test systems include a Cray XT5, a Cray XE6, and two multicore Infiniband clusters with different generations of NVIDIA graphics processing units (GPUs). We describe results for Fortran implementations using various combinations of MPI, OpenMP, and CUDA, with and without overlap of computation and communication. We find that overlapping CPU computation, GPU computation, parallel communication, and CPU-GPU communication can provide performance improvements of more than a factor of two.
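The overlap pattern the abstract describes can be illustrated with a deliberately small sketch. It is not the authors' code: the paper uses Fortran with MPI, OpenMP, and CUDA on a three-dimensional domain, whereas this example assumes a one-dimensional periodic grid, a first-order upwind scheme with positive velocity, and a Python thread pool standing in for the halo exchange. The names `upwind_reference`, `upwind_overlapped`, `chunks`, and the Courant number `c` are all hypothetical. The point is only the ordering: start the halo transfer, update the interior cells that do not depend on it, then finish the boundary cell once the halo has arrived.

```python
from concurrent.futures import ThreadPoolExecutor


def upwind_reference(u, c):
    """One first-order upwind step on a single periodic 1D grid (velocity > 0)."""
    # u[i - 1] with i == 0 wraps to u[-1], which gives the periodic boundary.
    return [u[i] - c * (u[i] - u[i - 1]) for i in range(len(u))]


def upwind_overlapped(chunks, c, pool):
    """One upwind step over several subdomains, overlapping halo fetch and interior work.

    Assumption: each subdomain needs only one halo value, the last cell of its
    left neighbor, because the scheme is upwind with positive velocity.
    """
    n = len(chunks)

    # 1. Start the "communication": request each left neighbor's last cell.
    #    The thread pool is a stand-in for a nonblocking exchange.
    halos = [pool.submit(lambda k=k: chunks[(k - 1) % n][-1]) for k in range(n)]

    # 2. Update interior cells, which depend only on local data,
    #    while the halo requests are in flight.
    new_chunks = []
    for u in chunks:
        new = [0.0] * len(u)
        for i in range(1, len(u)):
            new[i] = u[i] - c * (u[i] - u[i - 1])
        new_chunks.append(new)

    # 3. Wait for each halo, then update the one cell that needed it.
    for k, u in enumerate(chunks):
        left = halos[k].result()
        new_chunks[k][0] = u[0] - c * (u[0] - left)

    return new_chunks
```

Because the old subdomain arrays are only read during a step and the results go into new lists, the interior update and the halo fetch do not race. Concatenating the subdomains after each step should reproduce the single-grid result from `upwind_reference`.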

Original language: English
Title of host publication: Proceedings - 25th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2011
Pages: 59-67
Number of pages: 9
State: Published - 2011
Externally published: Yes
Event: 25th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2011 - Anchorage, AK, United States
Duration: May 16 2011 to May 20 2011

Publication series

Name: Proceedings - 25th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2011

Conference

Conference: 25th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2011
Country/Territory: United States
City: Anchorage, AK
Period: 05/16/11 to 05/20/11

Keywords

  • CUDA Fortran
  • GPU
  • MPI
  • OpenMP
  • linear advection
