Lossy all-to-all exchange for accelerating parallel 3-D FFTs on hybrid architectures with GPUs

Sebastien Cayrols, Jiali Li, George Bosilca, Stanimire Tomov, Alan Ayala, Jack Dongarra

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

2 Scopus citations

Abstract

In the context of parallel applications, communication is a critical part of the infrastructure and a potential bottleneck. The traditional approach to tackling communication challenges consists of redesigning algorithms so that the complexity or the communication volume is reduced. However, there are algorithms, like the Fast Fourier Transform (FFT), where reducing the volume of communication is very challenging yet can yield large benefits in terms of time-to-completion. In this paper, we revisit the implementation of the MPI all-to-all routine at the core of 3D FFTs by using advanced MPI features, such as One-Sided Communication, and integrate data compression during communication to reduce the volume of data exchanged. Since some compression techniques are 'lossy' in the sense that they involve a loss of accuracy, we study the impact of lossy compression in heFFTe, the state-of-the-art FFT library for large-scale 3D FFTs on hybrid architectures with GPUs. Consequently, we design an approximate FFT algorithm that trades off user-controlled accuracy for speed. We show that we speed up the 3D FFTs proportionally to the compression rate. In terms of accuracy, comparing our approach with a reduced precision execution, where both the data and the computation are in reduced precision, we show that when the volume of communication is compressed to the size of the reduced precision data, the approximate FFT algorithm is as fast as the one in reduced precision while the accuracy is one order of magnitude better.
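
The idea of compressing only the data in flight, while keeping computation in full precision, can be illustrated with a small single-process sketch. This is not the paper's implementation and does not use MPI or heFFTe: the function names (`compress`, `decompress`, `lossy_alltoall`) are hypothetical, the ranks are simulated with nested lists, and rounding float64 values to float32 is an assumed stand-in for a real lossy compressor, chosen because it shrinks the exchanged volume to the size of reduced precision data.

```python
import math
from array import array

def compress(block):
    """Assumed stand-in for a lossy compressor: round float64 to float32."""
    return array("f", block).tobytes()

def decompress(payload):
    """Widen the float32 payload back to Python floats (float64)."""
    return list(array("f", payload))

def lossy_alltoall(send):
    """Simulate an all-to-all exchange with compressed payloads.

    send[src][dst] is the block that rank src sends to rank dst.
    Returns (recv, bytes_sent), where recv[dst][src] is the block
    rank dst receives from rank src after decompression.
    """
    nranks = len(send)
    recv = [[None] * nranks for _ in range(nranks)]
    bytes_sent = 0
    for src in range(nranks):
        for dst in range(nranks):
            payload = compress(send[src][dst])  # only the wire format is lossy
            bytes_sent += len(payload)
            recv[dst][src] = decompress(payload)
    return recv, bytes_sent

# Hypothetical test data: 4 simulated ranks, 8 double-precision values per block.
NRANKS, BLOCK = 4, 8
send = [[[math.sin(1.0 + src + 0.1 * dst + 0.01 * i) for i in range(BLOCK)]
         for dst in range(NRANKS)] for src in range(NRANKS)]

recv, bytes_sent = lossy_alltoall(send)
raw_bytes = NRANKS * NRANKS * BLOCK * 8  # volume of an uncompressed float64 exchange

# Largest absolute error introduced by the lossy exchange.
max_err = max(abs(recv[dst][src][i] - send[src][dst][i])
              for src in range(NRANKS)
              for dst in range(NRANKS)
              for i in range(BLOCK))

print("compression rate:", raw_bytes / bytes_sent)
print("max abs error:", max_err)
```

In this sketch the received data is widened back to double precision, so any computation that follows the exchange still runs at full precision. That is the distinction the abstract draws against a reduced precision execution, where both the data and the computation are in reduced precision.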

Original language: English
Title of host publication: Proceedings - 2022 IEEE International Conference on Cluster Computing, CLUSTER 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 152-160
Number of pages: 9
ISBN (Electronic): 9781665498562
DOIs
State: Published - 2022
Event: 2022 IEEE International Conference on Cluster Computing, CLUSTER 2022 - Heidelberg, Germany
Duration: Sep 6 2022 to Sep 9 2022

Publication series

Name: Proceedings - IEEE International Conference on Cluster Computing, ICCC
Volume: 2022-September
ISSN (Print): 1552-5244

Conference

Conference: 2022 IEEE International Conference on Cluster Computing, CLUSTER 2022
Country/Territory: Germany
City: Heidelberg
Period: 09/6/22 to 09/9/22

Funding

This research was supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of two U.S. Department of Energy organizations (Office of Science and the National Nuclear Security Administration) responsible for the planning and preparation of a capable exascale ecosystem, including software, applications, hardware, advanced system engineering, and early testbed platforms, in support of the nation’s exascale computing imperative.

Funders | Funder number
U.S. Department of Energy organizations
National Nuclear Security Administration

Keywords

• All to all
• FFT
• Lossy compression
• MPI
