TY - GEN
T1 - GPU-accelerated asynchronous error correction for mixed precision iterative refinement
AU - Anzt, Hartwig
AU - Luszczek, Piotr
AU - Dongarra, Jack
AU - Heuveline, Vincent
PY - 2012
Y1 - 2012
N2 - In hardware-aware high performance computing, block-asynchronous iteration and mixed precision iterative refinement are two techniques that may be used to leverage the computing power of SIMD accelerators like GPUs in the iterative solution of linear equation systems. Although they use a very different approach for this purpose, they share the basic idea of compensating the convergence properties of an inferior numerical algorithm by a more efficient usage of the provided computing power. In this paper, we analyze the potential of combining both techniques. Therefore, we derive a mixed precision iterative refinement algorithm using a block-asynchronous iteration as an error correction solver, and compare its performance with a pure implementation of a block-asynchronous iteration and an iterative refinement method using double precision for the error correction solver. For matrices from the University of Florida Matrix collection, we report the convergence behaviour and provide the total solver runtime using different GPU architectures.
AB - In hardware-aware high performance computing, block-asynchronous iteration and mixed precision iterative refinement are two techniques that may be used to leverage the computing power of SIMD accelerators like GPUs in the iterative solution of linear equation systems. Although they use a very different approach for this purpose, they share the basic idea of compensating the convergence properties of an inferior numerical algorithm by a more efficient usage of the provided computing power. In this paper, we analyze the potential of combining both techniques. Therefore, we derive a mixed precision iterative refinement algorithm using a block-asynchronous iteration as an error correction solver, and compare its performance with a pure implementation of a block-asynchronous iteration and an iterative refinement method using double precision for the error correction solver. For matrices from the University of Florida Matrix collection, we report the convergence behaviour and provide the total solver runtime using different GPU architectures.
KW - GPU
KW - block-asynchronous iteration
KW - linear system
KW - mixed precision iterative refinement
KW - relaxation
UR - http://www.scopus.com/inward/record.url?scp=84867626599&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-32820-6_89
DO - 10.1007/978-3-642-32820-6_89
M3 - Conference contribution
AN - SCOPUS:84867626599
SN - 9783642328190
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 908
EP - 919
BT - Parallel Processing - 18th International Conference, Euro-Par 2012, Proceedings
T2 - 18th International Conference on Parallel Processing, Euro-Par 2012
Y2 - 27 August 2012 through 31 August 2012
ER -