Least squares solvers for distributed-memory machines with GPU accelerators

Jakub Kurzak, Mark Gates, Ali Charara, Asim Yarkhan, Jack Dongarra

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

2 Scopus citations

Abstract

This work presents an implementation of a linear least squares solver for distributed-memory machines with GPU accelerators, developed as part of the Software for Linear Algebra Targeting Exascale (SLATE) package. From the algorithmic standpoint, the work leverages recent advances in dense linear algebra, specifically the communication-avoiding QR factorization. From the implementation standpoint, the work represents a sharp departure from the traditional conventions established by legacy packages, such as LAPACK and ScaLAPACK. It is based on representing the matrix as a collection of individual tiles, and using batch operations for offloading work to accelerators. The article lays out the principles of the new approach, discusses the implementation details and presents the performance results.
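To make the algorithmic idea in the abstract concrete, the following is a minimal single-process sketch of a least squares solve built on a one-level, communication-avoiding (TSQR-style) QR reduction. It is not SLATE's implementation and uses neither tiles on GPUs nor batch kernels. The function name `tsqr_least_squares`, the `num_blocks` parameter, and the use of NumPy row blocks as stand-ins for distributed tiles are assumptions made only for illustration, and the matrix is assumed to have full column rank.

```python
import numpy as np


def tsqr_least_squares(A, b, num_blocks=4):
    """Solve min ||A x - b||_2 with a one-level TSQR-style reduction.

    Hypothetical illustration: row blocks play the role of tiles owned by
    different processes; only the small R factors and reduced right-hand
    sides would need to be communicated in a distributed setting.
    """
    # Split the rows of A and b into independent blocks.
    A_blocks = np.array_split(A, num_blocks, axis=0)
    b_blocks = np.array_split(b, num_blocks, axis=0)

    R_parts = []
    c_parts = []
    for A_i, b_i in zip(A_blocks, b_blocks):
        # Local QR of each block (could run in parallel, one per process).
        Q_i, R_i = np.linalg.qr(A_i, mode="reduced")
        R_parts.append(R_i)
        # Apply the local Q^T to the matching slice of the right-hand side.
        c_parts.append(Q_i.T @ b_i)

    # Reduction step: QR of the stacked local R factors.
    R_stack = np.vstack(R_parts)
    c_stack = np.concatenate(c_parts)
    Q_top, R = np.linalg.qr(R_stack, mode="reduced")

    # R is n-by-n upper triangular; solve R x = Q^T b.
    c = Q_top.T @ c_stack
    return np.linalg.solve(R, c)
```

On a tall, well-conditioned matrix, the result should agree with `np.linalg.lstsq(A, b, rcond=None)[0]` up to rounding error. A production solver would use a reduction tree with more than one level and a triangular solve rather than a general one.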

Original language: English
Title of host publication: ICS 2019 - International Conference on Supercomputing
Publisher: Association for Computing Machinery
Pages: 117-126
Number of pages: 10
ISBN (Electronic): 9781450360791
State: Published - Jun 26 2019
Externally published: Yes
Event: 33rd ACM International Conference on Supercomputing, ICS 2019, held in conjunction with the Federated Computing Research Conference, FCRC 2019 - Phoenix, United States
Duration: Jun 26 2019 → …

Publication series

Name: Proceedings of the International Conference on Supercomputing

Conference

Conference: 33rd ACM International Conference on Supercomputing, ICS 2019, held in conjunction with the Federated Computing Research Conference, FCRC 2019
Country/Territory: United States
City: Phoenix
Period: 06/26/19 → …

Keywords

  • Distributed memory
  • Least squares
  • Linear algebra
