Divide and conquer on hybrid GPU-accelerated multicore systems

Christof Vömel, Stanimire Tomov, Jack Dongarra

Research output: Contribution to journal › Article › peer-review

18 Scopus citations

Abstract

With the raw computing power of graphics processing units (GPUs) being more widely available in commodity multicore systems, there is an imminent need to harness their power for important numerical libraries such as LAPACK. In this paper, we consider the solution of dense symmetric and Hermitian eigenproblems by the LAPACK divide and conquer algorithm on such modern heterogeneous systems. We focus on how to make the best use of the individual strengths of the massively parallel manycore GPUs and multicore CPUs. The resulting algorithm overcomes performance bottlenecks faced by current implementations that are optimized for a homogeneous multicore. On a dual socket quad-core Intel Xeon 2.33 GHz with an NVIDIA GTX 280 GPU, we typically obtain up to about a tenfold improvement in performance for the complete dense problem. The techniques described here thus represent an example of how to develop numerical software to efficiently use heterogeneous architectures. As heterogeneity becomes more common in the architecture design, the significance of and need for this work are expected to grow.

Original language: English
Pages (from-to): C70-C82
Journal: SIAM Journal on Scientific Computing
Volume: 34
Issue number: 2
State: Published - 2012
Externally published: Yes

Keywords

  • GPU
  • Heterogeneous computing
  • Hybrid architecture
  • LAPACK
  • Multicore
  • Performance
  • Symmetric eigenvalue problem
