Design and implementation of a large scale tree-based QR decomposition using a 3D virtual systolic array and a lightweight runtime

Ichitaro Yamazaki, Jakub Kurzak, Piotr Luszczek, Jack Dongarra

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation

Abstract

A systolic array provides an alternative computing paradigm to the von Neumann architecture. Though its hardware implementation has failed as a paradigm to design integrated circuits in the past, we are now discovering that the systolic array as a software virtualization layer can lead to an extremely scalable execution paradigm. To demonstrate this scalability, in this paper, we design and implement a 3D virtual systolic array to compute a tile QR decomposition of a tall-and-skinny dense matrix. Our implementation is based on a state-of-the-art algorithm that factorizes a panel based on a tree-reduction. Using a runtime developed as a part of the Parallel Ultra Light Systolic Array Runtime (PULSAR) project, we demonstrate on a Cray-XT5 machine how our virtual systolic array can be mapped to a large-scale machine and obtain excellent parallel performance. This is an important contribution since such a QR decomposition is used, for example, to compute a least squares solution of an overdetermined system, which arises in many scientific and engineering problems.
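
As a rough illustration of the tree-reduction idea mentioned in the abstract, the sketch below computes the R factor of a tall-and-skinny matrix by factoring row blocks independently and then merging pairs of R factors up a binary tree. It is not the paper's PULSAR or systolic-array implementation. The function names `r_factor` and `tsqr_r`, the binary tree shape, the dense list-of-rows storage, and the `block_rows` parameter are all assumptions made for this example, and each block is assumed to have at least as many rows as the matrix has columns.

```python
import math


def r_factor(a):
    """Return the upper-triangular R from a Householder QR of `a`.

    `a` is a list of rows (m x n). Only R is formed; Q is discarded.
    """
    r = [row[:] for row in a]
    m, n = len(r), len(r[0])
    for k in range(min(m, n)):
        # Norm of column k on and below the diagonal.
        norm = math.sqrt(sum(r[i][k] ** 2 for i in range(k, m)))
        if norm == 0.0:
            continue  # column is already zero below the diagonal
        # Choose the sign that avoids cancellation in v[0].
        alpha = -norm if r[k][k] >= 0 else norm
        v = [r[i][k] for i in range(k, m)]
        v[0] -= alpha
        vtv = sum(x * x for x in v)
        # Apply H = I - 2 v v^T / (v^T v) to columns k..n-1.
        for j in range(k, n):
            s = sum(v[i] * r[k + i][j] for i in range(len(v)))
            f = 2.0 * s / vtv
            for i in range(len(v)):
                r[k + i][j] -= f * v[i]
    # Keep the leading rows and clean rounding noise below the diagonal.
    return [[r[i][j] if j >= i else 0.0 for j in range(n)]
            for i in range(min(m, n))]


def tsqr_r(a, block_rows):
    """R factor of tall-and-skinny `a` via a binary tree reduction (sketch)."""
    # Leaves of the tree: factor each row block on its own.
    level = [r_factor(a[i:i + block_rows])
             for i in range(0, len(a), block_rows)]
    # Inner nodes: stack two R factors and factor the stacked matrix.
    while len(level) > 1:
        merged = [r_factor(level[i] + level[i + 1])
                  for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            merged.append(level[-1])  # odd block passes to the next level
        level = merged
    return level[0]
```

Calling `tsqr_r(a, block_rows)` on a matrix stored as a list of rows returns an n-by-n upper-triangular matrix whose Gram matrix RᵀR matches AᵀA up to rounding. The leaf factorizations and the merges within one tree level are independent of each other, which is the property a parallel runtime can exploit.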

Original language: English
Title of host publication: Proceedings - IEEE 28th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2014
Publisher: IEEE Computer Society
Pages: 1495-1504
Number of pages: 10
ISBN (Electronic): 9780769552088
State: Published - Nov 27 2014
Event: 28th IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2014 - Phoenix, United States
Duration: May 19 2014 to May 23 2014

Publication series

Name: Proceedings - IEEE 28th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2014

Conference

Conference: 28th IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2014
Country/Territory: United States
City: Phoenix
Period: 05/19/14 to 05/23/14

Keywords

  • Dataflow
  • Message-passing
  • Multithreading
  • QR decomposition
  • Runtime
  • Systolic array
