Accelerating numerical dense linear algebra calculations with GPUs

Jack Dongarra, Mark Gates, Azzam Haidar, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov, Ichitaro Yamazaki

Research output: Chapter in Book/Report/Conference proceeding; Chapter; peer-review

81 Scopus citations

Abstract

This chapter presents the current best design and implementation practices for accelerating dense linear algebra (DLA) on GPUs. Examples are given with fundamental algorithms, from the matrix-matrix multiplication kernel written in CUDA to the higher-level algorithms for solving linear systems, eigenvalue problems, and SVD problems. The implementations are available through the MAGMA library, a redesign of the popular LAPACK for GPUs. To generate the extreme level of parallelism needed for the efficient use of GPUs, the algorithms of interest are redesigned and then split into well-chosen computational tasks. Task execution is scheduled over the computational components of a hybrid system of multicore CPUs with GPU accelerators, using either static scheduling or a lightweight runtime system. Lightweight runtime systems keep scheduling overhead low, comparable to static scheduling, while allowing parallelism to be expressed through sequential-like code. This simplifies development and allows the exploration of the unique strengths of the various hardware components.
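The idea of splitting an algorithm into well-chosen computational tasks can be illustrated with a minimal sketch. The code below is not taken from the chapter or from MAGMA; it is plain Python, and the names `tile_tasks`, `run_tile`, and `tiled_matmul` are hypothetical. It assumes square matrices stored as lists of lists and decomposes C = A·B into independent tile tasks, which are executed here in a simple sequential loop where a static schedule or a runtime system would instead dispatch them to CPU cores or GPUs.

```python
from itertools import product


def tile_tasks(n, nb):
    """Return the (row, col) start indices of every nb x nb tile of an n x n matrix."""
    starts = range(0, n, nb)
    return list(product(starts, starts))


def run_tile(A, B, C, i0, j0, nb):
    """Compute one tile of C = A * B.

    Each task writes a disjoint tile of C, so the tasks are independent
    and may be executed in any order (or concurrently).
    """
    n = len(A)
    for i in range(i0, min(i0 + nb, n)):
        for j in range(j0, min(j0 + nb, n)):
            C[i][j] = sum(A[i][k] * B[k][j] for k in range(n))


def tiled_matmul(A, B, nb=2):
    """Multiply square matrices A and B by running one task per output tile."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    # Sequential loop standing in for a scheduler (assumption of this sketch).
    for i0, j0 in tile_tasks(n, nb):
        run_tile(A, B, C, i0, j0, nb)
    return C
```

The loop in `tiled_matmul` reads like ordinary sequential code, yet every iteration is a self-contained task. That property is what lets a scheduler reorder or parallelize the work without changing the result.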

Original language: English
Title of host publication: Numerical Computations with GPUs
Publisher: Springer International Publishing
Pages: 3-28
Number of pages: 26
ISBN (Electronic): 9783319065489
ISBN (Print): 9783319065472
State: Published - Jan 1 2014
