Abstract
The fast Fourier transform (FFT) is one of the most important tools in mathematics, and it is widely used in science and engineering applications. State-of-the-art parallel implementations of the FFT algorithm, based on Cooley-Tukey developments, are known to be communication-bound, which causes critical issues when scaling computational and architectural capabilities. In this paper, we study the main performance bottleneck of FFT computations on hybrid CPU and GPU systems at large scale. We provide numerical simulations and potential acceleration techniques that can be easily integrated into distributed FFT libraries. We present experiments on performance scalability and runtime analysis on two of the world’s most powerful supercomputers today: Summit, using up to 6,144 NVIDIA V100 GPUs, and Fugaku, using more than one million Fujitsu A64FX cores.
Original language | English |
---|---|
Title of host publication | Parallel Computing Technologies - 16th International Conference, PaCT 2021, Proceedings |
Editors | Victor Malyshkin |
Publisher | Springer Science and Business Media Deutschland GmbH |
Pages | 279-287 |
Number of pages | 9 |
ISBN (Print) | 9783030863586 |
State | Published - 2021 |
Event | 16th International Conference on Parallel Computing Technologies, PaCT 2021 - Kaliningrad, Russian Federation; Duration: Sep 13 2021 → Sep 18 2021 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 12942 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 16th International Conference on Parallel Computing Technologies, PaCT 2021 |
---|---|
Country/Territory | Russian Federation |
City | Kaliningrad |
Period | 09/13/21 → 09/18/21 |
Funding
This research was supported by the Exascale Computing Project (ECP), Project Number: 17-SC-20-SC, a collaborative effort of two DOE organizations (the Office of Science and the National Nuclear Security Administration) responsible for the planning and preparation of a capable exascale ecosystem.
Keywords
- Hybrid systems
- Parallel FFT
- Scalability