Abstract
In this paper, we design and evaluate a routine for the efficient generation of block-Jacobi preconditioners on graphics processing units (GPUs). Concretely, to exploit the architecture of the graphics accelerator, we develop a batched Gauss-Jordan elimination CUDA kernel for matrix inversion that embeds an implicit pivoting technique and handles the entire inversion process in the GPU registers. In addition, we integrate extraction and insertion CUDA kernels to rapidly set up the block-Jacobi preconditioner. Our experiments compare the performance of our implementation against a sequence of batched routines from the MAGMA library realizing the inversion via the LU factorization with partial pivoting. Furthermore, we evaluate the costs of different strategies for the block-Jacobi extraction and insertion steps, using a variety of sparse matrices from the SuiteSparse matrix collection. Finally, we assess the efficiency of the complete block-Jacobi preconditioner generation in the context of an iterative solver applied to a set of computational science problems, and quantify its benefits over a scalar Jacobi preconditioner.
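The two steps named in the abstract, inverting many small diagonal blocks with Gauss-Jordan elimination and using the inverted blocks as a block-Jacobi preconditioner, can be illustrated with a minimal CPU-side sketch. This is not the paper's CUDA kernel. It is plain Python, it works on an augmented matrix `[A | I]` instead of inverting in place in GPU registers, and it assumes a dense input matrix with a uniform block size. The function names (`gauss_jordan_inverse`, `block_jacobi_setup`, `block_jacobi_apply`) are hypothetical. The sketch shows only the idea behind implicit pivoting: rows are never swapped during elimination, the pivot row chosen for each column is recorded, and the permutation is applied once when the result is written out.

```python
def gauss_jordan_inverse(block):
    """Invert one small dense block via Gauss-Jordan with implicit pivoting.

    Illustrative sketch: uses an augmented matrix [A | I] rather than the
    in-place, register-resident scheme described in the abstract.
    """
    n = len(block)
    # Build the augmented matrix [A | I] in floating point.
    work = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(block)
    ]
    used = [False] * n        # rows already chosen as a pivot row
    pivot_row = [-1] * n      # pivot_row[k] = row chosen for column k

    for k in range(n):
        # Pick the not-yet-used row with the largest magnitude in column k.
        p = max((i for i in range(n) if not used[i]),
                key=lambda i: abs(work[i][k]))
        if work[p][k] == 0.0:
            raise ValueError("block is singular")
        used[p] = True
        pivot_row[k] = p

        # Normalize the pivot row; no rows are swapped.
        scale = work[p][k]
        work[p] = [v / scale for v in work[p]]

        # Eliminate column k from every other row.
        for i in range(n):
            if i != p and work[i][k] != 0.0:
                f = work[i][k]
                work[i] = [vi - f * vp for vi, vp in zip(work[i], work[p])]

    # Apply the deferred permutation: row k of the inverse is the right
    # half of the row that served as pivot for column k.
    return [work[pivot_row[k]][n:] for k in range(n)]


def block_jacobi_setup(matrix, block_size):
    """Extract the diagonal blocks of a dense matrix and invert each one."""
    n = len(matrix)
    inverses = []
    for start in range(0, n, block_size):
        end = min(start + block_size, n)
        diag_block = [row[start:end] for row in matrix[start:end]]
        inverses.append(gauss_jordan_inverse(diag_block))
    return inverses


def block_jacobi_apply(inv_blocks, vector):
    """Apply the preconditioner: multiply each vector segment by its block."""
    out, offset = [], 0
    for inv in inv_blocks:
        m = len(inv)
        seg = vector[offset:offset + m]
        out.extend(sum(inv[i][j] * seg[j] for j in range(m)) for i in range(m))
        offset += m
    return out


# Usage: the first diagonal block has a zero in position (0, 0),
# so it cannot be inverted without pivoting.
A = [[0, 2, 9, 9],
     [1, 1, 9, 9],
     [9, 9, 4, 0],
     [9, 9, 0, 2]]
inv_blocks = block_jacobi_setup(A, block_size=2)
print(block_jacobi_apply(inv_blocks, [2, 4, 8, 6]))  # prints [3.0, 1.0, 2.0, 3.0]
```

Deferring the permutation to the final write keeps the elimination loop free of row exchanges, which is the property that makes implicit pivoting attractive when each row is held by a different GPU thread. In a batched setting the same routine would run independently on every diagonal block.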
Original language | English |
---|---|
Title of host publication | Proceedings of the 8th International Workshop on Programming Models and Applications for Multicores and Manycores, PMAM 2017 |
Editors | Quan Chen, Zhiyi Huang |
Publisher | Association for Computing Machinery, Inc |
Pages | 1-10 |
Number of pages | 10 |
ISBN (Electronic) | 9781450348836 |
State | Published - Feb 4 2017 |
Event | 8th International Workshop on Programming Models and Applications for Multicores and Manycores, PMAM 2017 - Austin, United States. Duration: Feb 5 2017 → … |
Publication series
Name | Proceedings of the 8th International Workshop on Programming Models and Applications for Multicores and Manycores, PMAM 2017 |
---|
Conference
Conference | 8th International Workshop on Programming Models and Applications for Multicores and Manycores, PMAM 2017 |
---|---|
Country/Territory | United States |
City | Austin |
Period | 02/5/17 → … |
Funding
This material is based upon work supported by the U.S. Department of Energy Office of Science, Office of Advanced Scientific Computing Research, Applied Mathematics program under Award Number DE-SC-0010042. G. Flegar and E. S. Quintana-Ortí were supported by project TIN2014-53495-R of the MINECO and FEDER.
Keywords
- Block-Jacobi preconditioner
- Gauss-Jordan elimination
- Graphics processing units (GPUs)
- Iterative methods
- Matrix inversion
- Sparse linear systems