Abstract
Today, the largest and most powerful supercomputers in the world are built on heterogeneous platforms, and using the combined power of multi-core CPUs and GPUs has had a great impact on accelerating large-scale applications. However, on these architectures, inter-processor communication becomes a bottleneck for parallel algorithms such as the Fast Fourier Transform (FFT) and limits their scalability. In this paper, we present techniques for reducing multi-process communication cost during the computation of FFTs, considering hybrid network connections such as those expected on upcoming exascale machines. Our techniques include algorithmic tuning, making use of phase diagrams; parametric tuning, using different FFT settings; and MPI distribution tuning based on FFT size and the computational resources available. We present several experiments performed on the Summit supercomputer at Oak Ridge National Laboratory, using up to 40,960 IBM Power9 cores and 6,144 NVIDIA V100 GPUs.
Original language | English |
---|---|
Title of host publication | Proceedings of ExaMPI 2021 |
Subtitle of host publication | Workshop on Exascale MPI, Held in conjunction with SC 2021: The International Conference for High Performance Computing, Networking, Storage and Analysis |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 46-53 |
Number of pages | 8 |
ISBN (Electronic) | 9781665411080 |
State | Published - 2021 |
Event | 2021 Workshop on Exascale MPI, ExaMPI 2021 - St. Louis, United States. Duration: Nov 14 2021 → … |
Publication series
Name | Proceedings of ExaMPI 2021: Workshop on Exascale MPI, Held in conjunction with SC 2021: The International Conference for High Performance Computing, Networking, Storage and Analysis |
---|---|
Conference
Conference | 2021 Workshop on Exascale MPI, ExaMPI 2021 |
---|---|
Country/Territory | United States |
City | St. Louis |
Period | 11/14/21 → … |
Funding
This research was supported by the Exascale Computing Project (ECP), Project Number: 17-SC-20-SC, a collaborative effort of two DOE organizations (the Office of Science and the National Nuclear Security Administration) responsible for the planning and preparation of a capable exascale ecosystem.
Keywords
- Exascale FFT
- Hybrid systems
- MPI tuning
- Scalability