Abstract
This paper considers key ideas in the design of out-of-core dense LU factorization routines. A left-looking variant of the LU factorization algorithm is shown to require less I/O to disk than the right-looking variant, and is used to develop a parallel, out-of-core implementation. This implementation makes use of a small library of parallel I/O routines, together with ScaLAPACK and PBLAS routines. Results for runs on an Intel Paragon are presented and interpreted using a simple performance model.
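
One way to see why a left-looking ordering can need less disk traffic is to count column transfers in a toy model. The sketch below is not the paper's implementation: it is unblocked, does no pivoting, runs serially, and uses an in-memory list of columns as a stand-in for the on-disk matrix. The function names and the read/write counters are hypothetical, introduced only for illustration.

```python
def left_looking_lu(cols):
    """Column-by-column left-looking LU without pivoting (illustrative only).

    `cols` is a list of matrix columns standing in for on-disk storage.
    Returns the packed factors (unit-lower L below the diagonal, U on and
    above it) plus the number of column reads and writes performed.
    """
    n = len(cols)
    store = [list(c) for c in cols]
    reads = writes = 0
    for j in range(n):
        col = list(store[j])          # bring column j in-core
        reads += 1
        for k in range(j):            # apply every finished column to its left
            lk = store[k]
            reads += 1
            ukj = col[k]              # U[k][j] is final at this point
            for i in range(k + 1, n):
                col[i] -= lk[i] * ukj
        pivot = col[j]
        for i in range(j + 1, n):     # scale to obtain column j of L
            col[i] /= pivot
        store[j] = col                # column j is written exactly once
        writes += 1
    return store, reads, writes


def right_looking_lu(cols):
    """Right-looking LU without pivoting, with the same counters."""
    n = len(cols)
    store = [list(c) for c in cols]
    reads = writes = 0
    for k in range(n):
        lk = list(store[k])
        reads += 1
        pivot = lk[k]
        for i in range(k + 1, n):
            lk[i] /= pivot
        store[k] = lk
        writes += 1
        for j in range(k + 1, n):     # update the whole trailing matrix now
            col = list(store[j])
            reads += 1
            ukj = col[k]
            for i in range(k + 1, n):
                col[i] -= lk[i] * ukj
            store[j] = col            # trailing columns are rewritten each step
            writes += 1
    return store, reads, writes


def reconstruct(packed):
    """Multiply the packed L and U factors back together, column by column."""
    n = len(packed)
    out = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for i in range(n):
            total = 0.0
            for k in range(min(i, j) + 1):
                l_ik = 1.0 if i == k else packed[k][i]
                total += l_ik * packed[j][k]
            out[j][i] = total
    return out
```

In this model both variants read the same number of columns, but the left-looking loop writes each of the n columns once, whereas the right-looking loop performs n + n(n-1)/2 column writes because it rewrites the trailing columns at every step. A blocked, pivoted, distributed out-of-core code has a more involved cost model, so the counts here indicate only the general direction of the trade-off.
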
| Original language | English |
|---|---|
| Pages (from-to) | 49-70 |
| Number of pages | 22 |
| Journal | Parallel Computing |
| Volume | 23 |
| Issue number | 1-2 |
| State | Published - Apr 1997 |
| Externally published | Yes |
Keywords
- LU factorization
- Out-of-core computation
- Parallel I/O
- Parallel computing