Variable-Size Batched LU for Small Matrices and Its Integration into Block-Jacobi Preconditioning

Hartwig Anzt, Jack Dongarra, Goran Flegar, Enrique S. Quintana-Ortí

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

10 Scopus citations

Abstract

We present a set of new batched CUDA kernels for the LU factorization of a large collection of independent problems of different sizes, and for the subsequent triangular solves. All kernels heavily exploit the registers of the graphics processing unit (GPU) in order to deliver high performance for small problems. The development of these kernels is motivated by the need to tackle this embarrassingly parallel scenario in the context of block-Jacobi preconditioning, which is relevant for the iterative solution of sparse linear systems.
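
To make the scenario in the abstract concrete, here is a minimal CPU-side sketch in plain Python. It is not the paper's CUDA implementation and says nothing about how the actual kernels use GPU registers. It only illustrates the two phases the abstract names: LU factorization of many independent small blocks of different sizes, followed by triangular solves that apply the resulting block-Jacobi preconditioner to a vector. All function names (`lu_factor_small`, `lu_solve_small`, `batched_lu`, `apply_block_jacobi`) are hypothetical, and the use of partial pivoting is an assumption that the abstract does not confirm.

```python
# Illustrative sketch only: a plain-Python stand-in for variable-size batched LU.
# Function names are hypothetical; partial pivoting is an assumption.

def lu_factor_small(a):
    """Factor one small dense block; returns (lu, perm) such that P*A = L*U."""
    n = len(a)
    lu = [row[:] for row in a]          # work on a copy of the block
    perm = list(range(n))               # perm[k] = original index of row k
    for k in range(n):
        # Partial pivoting: pick the largest entry in column k at or below the diagonal.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("singular block")
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
            perm[k], perm[p] = perm[p], perm[k]
        # Store the multipliers of L below the diagonal and update the trailing block.
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, perm

def lu_solve_small(lu, perm, b):
    """Solve A*x = b from the packed factors via two triangular solves."""
    n = len(lu)
    x = [b[perm[i]] for i in range(n)]  # apply the row permutation to b
    for i in range(n):                  # forward substitution (unit lower triangular L)
        for j in range(i):
            x[i] -= lu[i][j] * x[j]
    for i in reversed(range(n)):        # backward substitution (upper triangular U)
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x

def batched_lu(blocks):
    """Factor every block independently; the blocks may differ in size."""
    return [lu_factor_small(a) for a in blocks]

def apply_block_jacobi(factors, r):
    """Apply z = M^{-1} r, where M is block diagonal with the factored blocks."""
    z, offset = [], 0
    for lu, perm in factors:
        n = len(lu)
        z.extend(lu_solve_small(lu, perm, r[offset:offset + n]))
        offset += n
    return z
```

In this sketch the blocks are factored once with `batched_lu`, and `apply_block_jacobi` is then called with the stored factors each time the preconditioner is applied. Because no block depends on any other, the loops over blocks could run fully in parallel, which is the embarrassingly parallel structure the abstract refers to.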

Original language: English
Title of host publication: Proceedings - 46th International Conference on Parallel Processing, ICPP 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 91-100
Number of pages: 10
ISBN (Electronic): 9781538610428
State: Published - Sep 1 2017
Event: 46th International Conference on Parallel Processing, ICPP 2017 - Bristol, United Kingdom
Duration: Aug 14 2017 to Aug 17 2017

Publication series

Name: Proceedings of the International Conference on Parallel Processing
ISSN (Print): 0190-3918

Conference

Conference: 46th International Conference on Parallel Processing, ICPP 2017
Country/Territory: United Kingdom
City: Bristol
Period: 08/14/17 to 08/17/17

Funding

This material is based upon work supported by the U.S. Department of Energy Office of Science, Office of Advanced Scientific Computing Research, Applied Mathematics program under Award Number DE-SC-0010042. H. Anzt was supported by the “Impuls- und Vernetzungsfonds” of the Helmholtz Association. G. Flegar and E. S. Quintana-Ortí were supported by project TIN2014-53495-R of the MINECO, FEDER, and the EU H2020 project

Keywords

  • Block-Jacobi
  • GPU
  • Variable-size batched LU
