Abstract
Three parallel algorithms for computing the QR-factorization of a matrix are presented. The discussion is primarily concerned with implementation of these algorithms on a computer that supports tightly coupled parallel processes sharing a large common memory. The three algorithms are a Householder method based upon high-level modules, a Windowed Householder method that avoids fork-join synchronization, and a Pipelined Givens method that is a variant of the data-flow type algorithms offering large enough granularity to mask synchronization costs. Numerical experiments were conducted on the Denelcor HEP computer. The computational results indicate that the Pipelined Givens method is preferred and that this is primarily due to the number of array references required by the various algorithms.
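As an illustration of the Givens approach named in the abstract, the following is a minimal sequential sketch of QR factorization by Givens rotations. It is not the paper's Pipelined Givens implementation for the HEP: the function name `givens_qr` and the bottom-up elimination order are assumptions made for this example. Each rotation touches only two adjacent rows, so rotations on disjoint row pairs are independent of one another, which is the property that parallel and pipelined variants generally rely on.

```python
import math

def givens_qr(a):
    """Return (q, r) with a = q * r, using Givens rotations.

    Sequential illustration only; `a` is a list of m rows of length n (m >= n).
    """
    m, n = len(a), len(a[0])
    r = [[float(v) for v in row] for row in a]
    # q starts as the identity and accumulates the transposed rotations.
    q = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    for j in range(n):
        # Zero column j below the diagonal, working from the bottom row up.
        for i in range(m - 1, j, -1):
            x, y = r[i - 1][j], r[i][j]
            h = math.hypot(x, y)
            if h == 0.0:
                continue  # both entries already zero, nothing to rotate
            c, s = x / h, y / h
            # Rotate rows i-1 and i of r so that r[i][j] becomes zero.
            for k in range(n):
                t1, t2 = r[i - 1][k], r[i][k]
                r[i - 1][k] = c * t1 + s * t2
                r[i][k] = -s * t1 + c * t2
            # Apply the transposed rotation to columns i-1 and i of q.
            for k in range(m):
                t1, t2 = q[k][i - 1], q[k][i]
                q[k][i - 1] = c * t1 + s * t2
                q[k][i] = -s * t1 + c * t2
    return q, r
```

Calling `givens_qr([[1, 2], [3, 4], [5, 6]])` returns an orthogonal 3×3 `q` and an upper-triangular 3×2 `r` whose product reproduces the input up to rounding error.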
Original language | English |
---|---|
Pages (from-to) | 25-34 |
Number of pages | 10 |
Journal | Parallel Computing |
Volume | 3 |
Issue number | 1 |
State | Published - Mar 1986 |
Externally published | Yes |
Funding
An earlier version of this paper appeared in the proceedings of the Eighteenth Annual Hawaii International Conference on System Sciences in January 1985. This work was supported in part by the Applied Mathematical Sciences subprogram of the Office of Energy Research, U.S. Department of Energy, under Contract g-31-109-Eng-33.
Funders | Funder number |
---|---|
Office of Energy Research | |
U.S. Department of Energy | g-31-109-Eng-33 |
Keywords
- Denelcor HEP
- performance analysis