Dense symmetric indefinite factorization on GPU accelerated architectures

Marc Baboulin, Jack Dongarra, Adrien Rémy, Stanimire Tomov, Ichitaro Yamazaki

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

5 Scopus citations

Abstract

We study the performance of dense symmetric indefinite factorizations (the Bunch-Kaufman and Aasen algorithms) on multicore CPUs with a Graphics Processing Unit (GPU). Though such algorithms are needed in many scientific and engineering simulations, obtaining high performance from the factorization on the GPU is difficult because the pivoting required to ensure numerical stability leads to frequent synchronizations and irregular data accesses. As a result, until recently, there was no implementation of these algorithms on hybrid CPU/GPU architectures. To improve their performance on the hybrid architecture, we explore different techniques to reduce the expensive communication and synchronization between the CPU and GPU, or on the GPU. We also study the performance of an LDL^T factorization with no pivoting combined with a preprocessing technique based on Random Butterfly Transformations. Although such transformations provide only probabilistic guarantees of numerical stability, they avoid pivoting and achieve high performance on the GPU.
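The last idea in the abstract, randomizing the matrix so that LDL^T can run without pivoting, can be illustrated with a minimal CPU-only NumPy sketch. It is not the authors' GPU implementation. The function names (`butterfly`, `recursive_butterfly`, `ldlt_nopiv`, `solve_rbt`), the recursion depth of 2, and the choice of diagonal entries exp(r/10) with r uniform in [-1/2, 1/2] are assumptions made for illustration, and the butterfly matrices are formed densely for clarity rather than speed.

```python
import numpy as np

def butterfly(n, rng):
    """Dense n-by-n random butterfly (n even): [[R0, R1], [R0, -R1]] / sqrt(2)."""
    h = n // 2
    # Assumed distribution for the random diagonal entries: exp(r / 10).
    r0 = np.diag(np.exp(rng.uniform(-0.5, 0.5, h) / 10))
    r1 = np.diag(np.exp(rng.uniform(-0.5, 0.5, h) / 10))
    return np.block([[r0, r1], [r0, -r1]]) / np.sqrt(2)

def recursive_butterfly(n, depth, rng):
    """Product of `depth` block-diagonal butterfly levels."""
    assert n % (2 ** depth) == 0, "n must be divisible by 2**depth"
    u = np.eye(n)
    for k in range(depth):
        blocks = 2 ** k
        size = n // blocks
        level = np.zeros((n, n))
        for b in range(blocks):
            s = slice(b * size, (b + 1) * size)
            level[s, s] = butterfly(size, rng)
        u = level @ u  # finer levels are applied on the left
    return u

def ldlt_nopiv(a):
    """LDL^T of a symmetric matrix with no pivoting (breaks down on a zero pivot)."""
    a = np.array(a, dtype=float)
    n = a.shape[0]
    l = np.eye(n)
    d = np.zeros(n)
    for j in range(n):
        d[j] = a[j, j]
        l[j + 1:, j] = a[j + 1:, j] / d[j]
        # Schur complement update of the trailing submatrix.
        a[j + 1:, j + 1:] -= np.outer(l[j + 1:, j], a[j, j + 1:])
    return l, d

def solve_rbt(a, b, depth=2, seed=0):
    """Solve A x = b via A_r = U^T A U, A_r = L D L^T, x = U y."""
    rng = np.random.default_rng(seed)
    u = recursive_butterfly(a.shape[0], depth, rng)
    l, d = ldlt_nopiv(u.T @ a @ u)
    # Generic solves are used for brevity; triangular solves would be cheaper.
    y = np.linalg.solve(l, u.T @ b)
    y = np.linalg.solve(l.T, y / d)
    return u @ y
```

A symmetric matrix with a zero diagonal makes the point: calling `ldlt_nopiv` on it directly divides by a zero pivot at the first step, whereas `solve_rbt` factors the randomized matrix U^T A U instead, which has nonzero pivots with high probability. In practice the butterflies would be stored and applied in packed form, and a few steps of iterative refinement are often added because the stability guarantee is only probabilistic.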

Original language: English
Title of host publication: Parallel Processing and Applied Mathematics - 11th International Conference, PPAM 2015, Revised Selected Papers
Editors: Ewa Deelman, Jack Dongarra, Konrad Karczewski, Roman Wyrzykowski, Jacek Kitowski, Kazimierz Wiatr
Publisher: Springer Verlag
Pages: 86-95
Number of pages: 10
ISBN (Print): 9783319321486
State: Published - 2016
Externally published: Yes
Event: 11th International Conference on Parallel Processing and Applied Mathematics, PPAM 2015 - Krakow, Poland
Duration: Sep 6 2015 to Sep 9 2015

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9573
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 11th International Conference on Parallel Processing and Applied Mathematics, PPAM 2015
Country/Territory: Poland
City: Krakow
Period: 09/6/15 to 09/9/15

Funding

The authors would like to thank the NSF grant #ACI-1339822, NVIDIA, and MathWorks for supporting this research effort. The authors are also grateful to Nicolas Zerbib (ESI Group) for his help in using test matrices from acoustics.

Funders and funder numbers
National Science Foundation: 1339822
NVIDIA
MathWorks

    Keywords

    • Communication-avoiding
    • Dense symmetric indefinite factorization
    • GPU computation
    • Randomization
