Improving performance of GMRES by reducing communication and pipelining global collectives

Ichitaro Yamazaki, Mark Hoemmen, Piotr Luszczek, Jack Dongarra

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

21 Scopus citations

Abstract

We compare the performance of pipelined and s-step GMRES, respectively referred to as l-GMRES and s-GMRES, on distributed multicore CPUs. Compared to standard GMRES, s-GMRES requires fewer all-reduces, while l-GMRES overlaps the all-reduces with computation. To combine the best features of the two algorithms, we propose another variant, (l, t)-GMRES, that not only performs fewer global all-reduces than standard GMRES, but also overlaps those all-reduces with other work. We implemented thread parallelism and communication overlap in two different ways. The first uses nonblocking MPI collectives with thread-parallel computational kernels. The second relies on a shared-memory task scheduler. In our experiments, (l, t)-GMRES performed better than l-GMRES by factors of up to 1.67×. In addition, though we only used 50 nodes, when the latency cost became significant, our variant performed up to 1.22× better than s-GMRES by hiding all-reduces.
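The overlap idea in the abstract (start a global reduction, do independent work, then wait for the result) can be illustrated with a minimal sketch. This is not the authors' implementation: a single-worker thread pool stands in for a nonblocking all-reduce such as MPI-3's `MPI_Iallreduce`, the per-rank data are plain Python lists, and the function names `global_dot`, `local_matvec`, and `overlapped_step` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def global_dot(parts_x, parts_y):
    """Stand-in for an all-reduce: sum the per-rank partial dot products."""
    return sum(
        sum(a * b for a, b in zip(px, py))
        for px, py in zip(parts_x, parts_y)
    )


def local_matvec(diag, v):
    """Stand-in for local work that does not depend on the reduction result."""
    return [d * vi for d, vi in zip(diag, v)]


def overlapped_step(parts_x, parts_y, diag, v):
    """Start the reduction, do independent work, then wait for the result."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(global_dot, parts_x, parts_y)  # "post" the reduction
        w = local_matvec(diag, v)                            # overlapped local work
        dot = pending.result()                               # "wait" for completion
    return dot, w


if __name__ == "__main__":
    # Two hypothetical "ranks", each holding two vector entries.
    parts_x = [[1.0, 2.0], [3.0, 4.0]]
    parts_y = [[1.0, 1.0], [1.0, 1.0]]
    print(overlapped_step(parts_x, parts_y, [2.0, 3.0], [1.0, 1.0]))
```

In an actual MPI code the submit/result pair would correspond to posting the nonblocking collective and later waiting on its request; the sketch only shows the ordering of the three steps, not any performance effect.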

Original language: English
Title of host publication: Proceedings - 2017 IEEE 31st International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1118-1127
Number of pages: 10
ISBN (Electronic): 9781538634080
State: Published - Jun 30 2017
Externally published: Yes
Event: 31st IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2017 - Orlando, United States
Duration: May 29 2017 → Jun 2 2017

Publication series

Name: Proceedings - 2017 IEEE 31st International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2017

Conference

Conference: 31st IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2017
Country/Territory: United States
City: Orlando
Period: 05/29/17 → 06/2/17

Funding

We thank Xi Luo at the University of Tennessee for helpful discussions on the non-blocking all-reduce communication. This research was supported in part by the U.S. Department of Energy Office of Science under Award Numbers DE-FG0213ER26137 and DE-SC0010042, and the U.S. National Science Foundation under Award Number 1339822. Sandia National Laboratories is a multi-mission laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.
