Design and implementation of the parallel out-of-core ScaLAPACK LU, QR, and Cholesky factorization routines

Eduardo D'Azevedo, Jack Dongarra

Research output: Contribution to journal › Article › peer-review

41 Scopus citations

Abstract

This paper describes the design and implementation of three core factorization routines (LU, QR, and Cholesky) included in the out-of-core extension of ScaLAPACK. These routines allow the factorization and solution of a dense linear system that is too large to fit entirely in physical memory. The full matrix is stored on disk, and the factorization routines transfer submatrix panels into memory. The 'left-looking' column-oriented variant of the factorization algorithm is implemented to reduce the disk I/O traffic. The routines are implemented using a portable I/O interface and use high-performance ScaLAPACK factorization routines as in-core computational kernels. We present the details of the implementation for the out-of-core ScaLAPACK factorization routines, as well as performance and scalability results on a Beowulf Linux cluster.
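The left-looking panel scheme mentioned in the abstract can be illustrated with a small serial sketch. It is not the ScaLAPACK out-of-core interface: the names `DiskMatrix`, `store`, and `ooc_cholesky` are hypothetical, the "disk" is a single column-major file of doubles, only the Cholesky case is shown (no pivoting, no parallel distribution), and a plain Python loop stands in for the in-core kernel.

```python
import math
from array import array

ITEM = array("d").itemsize  # bytes per stored double


class DiskMatrix:
    """Hypothetical n x n matrix kept in a file, one column after another."""

    def __init__(self, path, n):
        self.path, self.n = path, n
        self.reads = self.writes = 0  # panel transfers, for illustration

    def read_panel(self, c0, width):
        """Load columns c0 .. c0+width-1 into memory as lists of floats."""
        cols = []
        with open(self.path, "rb") as f:
            f.seek(c0 * self.n * ITEM)
            for _ in range(width):
                col = array("d")
                col.fromfile(f, self.n)
                cols.append(col.tolist())
        self.reads += 1
        return cols

    def write_panel(self, c0, cols):
        """Overwrite the columns starting at c0 with the in-memory panel."""
        with open(self.path, "r+b") as f:
            f.seek(c0 * self.n * ITEM)
            for col in cols:
                array("d", col).tofile(f)
        self.writes += 1


def store(path, A):
    """Write a dense matrix (list of rows) to disk, column by column."""
    n = len(A)
    with open(path, "wb") as f:
        for j in range(n):
            array("d", [A[i][j] for i in range(n)]).tofile(f)
    return DiskMatrix(path, n)


def ooc_cholesky(disk, nb):
    """Left-looking Cholesky (A = L L^T) over column panels of width nb.

    Assumes A is symmetric positive definite. On return the file holds L,
    with the strictly upper triangle set to zero.
    """
    n = disk.n
    for c0 in range(0, n, nb):
        w = min(nb, n - c0)
        panel = disk.read_panel(c0, w)
        # Left-looking step: pull in each already-factored panel to the
        # left and apply its update to the current panel only.
        for k0 in range(0, c0, nb):
            for lcol in disk.read_panel(k0, nb):
                for j in range(w):
                    ljk = lcol[c0 + j]
                    col = panel[j]
                    for i in range(c0 + j, n):
                        col[i] -= lcol[i] * ljk
        # In-core factorization of the updated panel.
        for j in range(w):
            d, col = c0 + j, panel[j]
            for p in range(j):
                lp = panel[p]
                ljp = lp[d]
                for i in range(d, n):
                    col[i] -= lp[i] * ljp
            pivot = math.sqrt(col[d])
            for i in range(d, n):
                col[i] /= pivot
            for i in range(d):
                col[i] = 0.0
        disk.write_panel(c0, panel)  # each panel is written back once
```

In this sketch every panel is written to disk exactly once, after all updates from the panels to its left have been applied in memory; the cost is that earlier panels are re-read for each later panel. Calling `store(path, A)` followed by `ooc_cholesky(disk, nb)` leaves the factor `L` in the file.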

Original language: English
Pages (from-to): 1481-1493
Number of pages: 13
Journal: Concurrency Practice and Experience
Volume: 12
Issue number: 15
State: Published - Dec 25 2000
Externally published: Yes
