Dense linear algebra on accelerated multicore hardware

Jack Dongarra, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov

Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review

3 Scopus citations

Abstract

The design of systems exceeding 1 Pflop/s and inching towards 1 Eflop/s has forced a dramatic shift in hardware design. Various physical and engineering constraints resulted in the introduction of massive parallelism and functional hybridization with the use of accelerator units. This paradigm change poses a serious challenge for application developers, as the management of multicore proliferation and heterogeneity rests on software, and it is reasonable to expect that this situation will not change in the foreseeable future. This chapter presents a methodology for dealing with this issue in three common scenarios. In the context of shared-memory multicore installations, we show how high performance and scalability go hand in hand when well-known linear algebra algorithms are recast in terms of Directed Acyclic Graphs (DAGs), which are then transparently scheduled at runtime inside the Parallel Linear Algebra Software for Multicore Architectures (PLASMA) project. Similarly, Matrix Algebra on GPU and Multicore Architectures (MAGMA) schedules DAG-driven computations on multicore processors and accelerators. Finally, Distributed PLASMA (DPLASMA) takes the approach to distributed-memory machines with the use of automatic dependence analysis and the Directed Acyclic Graph Engine (DAGuE) to deliver high performance at the scale of many thousands of cores.
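As a rough illustration of what recasting an algorithm as a DAG can look like, the sketch below lists the tile kernels of a Cholesky factorization, derives task dependencies from the tiles each kernel reads and writes, and groups the tasks into waves that could run concurrently. It is a minimal sketch under stated assumptions, not the PLASMA, MAGMA, or DPLASMA implementation: the function names (`cholesky_tasks`, `build_dag`, `schedule_waves`), the choice of Cholesky as the example, and the wave-by-wave scheduling policy are all hypothetical choices made for illustration, and no numerical work is performed.

```python
from collections import defaultdict


def cholesky_tasks(nt):
    """Yield (name, reads, writes) for a tile Cholesky on an nt x nt tile grid.

    Tiles are identified by (row, col). A written tile is updated in place,
    so it is implicitly read as well. Kernel names follow the usual BLAS/LAPACK
    labels; the task list itself is an illustrative assumption.
    """
    for k in range(nt):
        yield (f"POTRF({k})", [], [(k, k)])
        for m in range(k + 1, nt):
            yield (f"TRSM({m},{k})", [(k, k)], [(m, k)])
        for m in range(k + 1, nt):
            yield (f"SYRK({m},{k})", [(m, k)], [(m, m)])
            for n in range(k + 1, m):
                yield (f"GEMM({m},{n},{k})", [(m, k), (n, k)], [(m, n)])


def build_dag(tasks):
    """Derive dependencies from data hazards, scanning tasks in serial order."""
    last_writer = {}               # tile -> task that last wrote it
    readers = defaultdict(list)    # tile -> tasks that read it since that write
    deps = {}
    for name, reads, writes in tasks:
        needed = set()
        for tile in reads:         # read-after-write
            if tile in last_writer:
                needed.add(last_writer[tile])
        for tile in writes:        # write-after-write and write-after-read
            if tile in last_writer:
                needed.add(last_writer[tile])
            needed.update(readers[tile])
        for tile in reads:
            readers[tile].append(name)
        for tile in writes:
            last_writer[tile] = name
            readers[tile] = []
        needed.discard(name)
        deps[name] = needed
    return deps


def schedule_waves(deps):
    """Group tasks into waves; every task in a wave has all its inputs ready."""
    remaining = {task: set(d) for task, d in deps.items()}
    done, waves = set(), []
    while remaining:
        ready = sorted(t for t, d in remaining.items() if d <= done)
        if not ready:
            raise ValueError("dependency cycle detected")
        waves.append(ready)
        done.update(ready)
        for task in ready:
            del remaining[task]
    return waves


if __name__ == "__main__":
    for i, wave in enumerate(schedule_waves(build_dag(cholesky_tasks(3)))):
        print(i, wave)
```

A real runtime would typically dispatch each task as soon as its own dependencies complete rather than waiting for a whole wave, but the waves make the available parallelism easy to see.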

Original language: English
Title of host publication: High-Performance Scientific Computing
Subtitle of host publication: Algorithms and Applications
Publisher: Springer-Verlag London Ltd
Pages: 123-146
Number of pages: 24
Volume: 9781447124375
ISBN (Electronic): 9781447124375
ISBN (Print): 1447124367, 9781447124368
State: Published - Oct 1 2012
Externally published: Yes
