Communication avoiding 2D stencil implementations over PaRSEC task-based runtime

Yu Pei, Qinglei Cao, George Bosilca, Piotr Luszczek, Victor Eijkhout, Jack Dongarra

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

Stencil computations and general sparse matrix-vector products (SpMV) are key components of many algorithms, such as geometric multigrid and Krylov solvers. Their low arithmetic intensity, however, means that memory bandwidth and network latency are the performance-limiting factors. The current architectural trend favors computation over bandwidth, worsening an already unfavorable imbalance. Previous work approached stencil kernel optimization either by improving memory bandwidth usage or by providing a Communication Avoiding (CA) scheme that minimizes network latency in repeated sparse matrix-vector multiplication by replicating remote work in order to delay communications on the critical path. Focusing on minimizing the communication bottleneck in distributed stencil computation, in this study we combine a CA scheme with the overlap of computation and communication that is inherent in a dataflow task-based runtime system such as PaRSEC, to demonstrate their combined benefits. We implemented the 2D five-point stencil (Jacobi iteration) in PETSc, and over PaRSEC in two flavors that operate directly on a 2D compute grid: full communications (base-PaRSEC) and CA-PaRSEC. Our results on two clusters, NaCL and Stampede2, indicate that we can achieve a 2X speedup over the standard SpMV solution implemented in PETSc and that, in certain cases when kernel execution does not dominate the execution time, the CA-PaRSEC version achieves up to 57% and 33% speedup over the base-PaRSEC implementation on NaCL and Stampede2, respectively.
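The CA idea described in the abstract can be illustrated with a minimal, single-process Python sketch. It is not the authors' PaRSEC or PETSc code: the function names (`jacobi_step`, `reference`, `ca_jacobi`) are hypothetical, the "ranks" are simulated in one process, and the grid is split by rows only for brevity, whereas the paper operates on a 2D compute grid. Each simulated rank copies a halo of `s` rows once, then performs `s` Jacobi sweeps locally, redundantly recomputing its neighbors' rows so that no exchange is needed in between.

```python
def jacobi_step(u):
    """One five-point Jacobi sweep; the outermost rows/columns stay fixed."""
    n, m = len(u), len(u[0])
    new = [row[:] for row in u]
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            new[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                + u[i][j - 1] + u[i][j + 1])
    return new


def reference(u, iters):
    """Plain global Jacobi iteration, used as the ground truth."""
    for _ in range(iters):
        u = jacobi_step(u)
    return u


def ca_jacobi(u, iters, nparts, s):
    """Simulated communication-avoiding Jacobi over a row decomposition.

    Hypothetical sketch: each part fetches a halo of up to `s` rows once
    (one "exchange"), then runs up to `s` sweeps without communicating.
    Ghost rows go stale one row per sweep, starting from the outer edge,
    so after `s` sweeps the owned rows are still exact.
    Returns the final grid and the number of exchange rounds.
    """
    n = len(u)
    bounds = [(p * n // nparts, (p + 1) * n // nparts) for p in range(nparts)]
    exchanges, done = 0, 0
    while done < iters:
        steps = min(s, iters - done)
        new_global = [None] * n
        for lo, hi in bounds:
            ext_lo, ext_hi = max(0, lo - steps), min(n, hi + steps)
            # Stand-in for the single halo exchange of this round.
            block = [row[:] for row in u[ext_lo:ext_hi]]
            for _ in range(steps):
                block = jacobi_step(block)  # includes redundant ghost work
            for i in range(lo, hi):
                new_global[i] = block[i - ext_lo]
        u = new_global
        exchanges += 1
        done += steps
    return u, exchanges


def make_grid(n=12, m=8):
    """Zero grid with a hot top boundary row (an arbitrary test problem)."""
    grid = [[0.0] * m for _ in range(n)]
    grid[0] = [1.0] * m
    return grid


if __name__ == "__main__":
    expected = reference(make_grid(), 6)
    result, rounds = ca_jacobi(make_grid(), 6, nparts=3, s=3)
    diff = max(abs(a - b) for ra, rb in zip(result, expected)
               for a, b in zip(ra, rb))
    print("exchange rounds:", rounds, "max abs difference:", diff)
```

With `s = 1` the sketch degenerates to one exchange per iteration, which corresponds to the full-communication pattern; larger `s` trades extra redundant computation on the ghost rows for fewer exchange rounds.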

Original language: English
Title of host publication: Proceedings - 2020 IEEE 34th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 721-729
Number of pages: 9
ISBN (Electronic): 9781728174457
State: Published - May 2020
Externally published: Yes
Event: 34th IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2020 - New Orleans, United States
Duration: May 18 2020 → May 22 2020

Publication series

Name: Proceedings - 2020 IEEE 34th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2020

Conference

Conference: 34th IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2020
Country/Territory: United States
City: New Orleans
Period: 05/18/20 → 05/22/20

Funding

This work was supported in part by the National Science Foundation under Grant No. 1740250, and the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration, under UT-Battelle subaward 4000153505.

Funders (funder number)
U.S. Department of Energy Office of Science
National Science Foundation (17-SC-20-SC, 1740250)
Battelle
National Nuclear Security Administration

    Keywords

    • 2D stencil
    • Communication avoiding
    • Parallel programming models
