Solving systems of linear equations on the CELL processor using Cholesky factorization

Jakub Kurzak, Alfredo Buttari, Jack Dongarra

Research output: Contribution to journal › Article › peer-review

73 Scopus citations

Abstract

The Sony/Toshiba/IBM (STI) CELL processor introduces pioneering solutions in processor architecture. At the same time, it presents new challenges for the development of numerical algorithms. One is effective exploitation of the differential between the speed of single and double precision arithmetic; the other is efficient parallelization between the short vector SIMD cores. The first challenge is addressed by utilizing the well-known technique of iterative refinement for the solution of a dense symmetric positive definite system of linear equations, resulting in a mixed-precision algorithm that delivers double precision accuracy while performing the bulk of the work in single precision. The main contribution of this paper lies in addressing the second challenge by successful thread-level parallelization, exploiting fine-grained task granularity and lightweight decentralized synchronization. The implementation of the computationally intensive sections gets within 90 percent of peak floating point performance, while the implementation of the memory intensive sections reaches within 90 percent of peak memory bandwidth. On a single CELL processor, the algorithm achieves over 170 Gflops when solving a symmetric positive definite system of linear equations in single precision and over 150 Gflops when delivering the result in double precision accuracy.
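
The mixed-precision idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' CELL implementation: it is a scalar, pure-Python illustration that assumes single precision can be emulated by rounding every intermediate result through `struct`. The function names (`f32`, `cholesky_f32`, `solve_f32`, `refine_solve`) and the stopping rule (`tol`, `max_iter`) are hypothetical choices made for this example. The factorization and the triangular solves run in emulated single precision, while the residual and the solution update are kept in double precision.

```python
import math
import struct


def f32(x):
    """Round a double to the nearest IEEE single-precision value (emulation)."""
    return struct.unpack("f", struct.pack("f", x))[0]


def cholesky_f32(a):
    """Lower Cholesky factor of an SPD matrix, every operation rounded to single."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        s = f32(a[j][j])
        for k in range(j):
            s = f32(s - f32(l[j][k] * l[j][k]))
        l[j][j] = f32(math.sqrt(s))
        for i in range(j + 1, n):
            s = f32(a[i][j])
            for k in range(j):
                s = f32(s - f32(l[i][k] * l[j][k]))
            l[i][j] = f32(s / l[j][j])
    return l


def solve_f32(l, b):
    """Solve L L^T x = b by forward and back substitution in emulated single."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        s = f32(b[i])
        for k in range(i):
            s = f32(s - f32(l[i][k] * y[k]))
        y[i] = f32(s / l[i][i])
    x = [0.0] * n
    for i in reversed(range(n)):
        s = y[i]
        for k in range(i + 1, n):
            s = f32(s - f32(l[k][i] * x[k]))
        x[i] = f32(s / l[i][i])
    return x


def refine_solve(a, b, max_iter=20, tol=1e-14):
    """Mixed-precision solve: single-precision factorization, double-precision refinement."""
    n = len(b)
    l = cholesky_f32(a)      # bulk of the work, single precision
    x = solve_f32(l, b)      # initial solution, single-precision accuracy
    for _ in range(max_iter):
        # Residual r = b - A x, computed in double precision.
        r = [b[i] - sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
        if max(abs(v) for v in r) <= tol * max(abs(v) for v in b):
            break
        d = solve_f32(l, r)  # correction, reusing the single-precision factor
        x = [x[i] + d[i] for i in range(n)]  # update kept in double precision
    return x


if __name__ == "__main__":
    a = [[4.0, 1.0, 2.0], [1.0, 3.0, 0.0], [2.0, 0.0, 5.0]]
    b = [12.0, 7.0, 17.0]  # chosen so that the exact solution is [1, 2, 3]
    print(refine_solve(a, b))
```

The factorization is computed once and reused for every correction step, so each refinement iteration costs only two triangular solves and one matrix-vector product. For a well-conditioned matrix such as the one in the usage example, a few iterations are typically enough to recover double-precision accuracy.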

Original language: English
Pages (from-to): 1175-1185
Number of pages: 11
Journal: IEEE Transactions on Parallel and Distributed Systems
Volume: 19
Issue number: 9
State: Published - 2008
Externally published: Yes

Funding

The authors thank Gary Rancourt and Kirk Jordan at IBM for taking care of their hardware needs and arranging for partial financial support for this work. The authors are thankful to numerous IBM researchers for generously sharing their CELL expertise, in particular Sidney Manning, Daniel Brokenshire, Mike Kistler, Gordon Fossum, Thomas Chen, and Michael Perrone. This work was supported in part by grants from the US National Science Foundation (NSF) and US Department of Energy (DoE).

Keywords

  • CELL Broadband Engine
  • Numerical linear algebra
  • Parallel algorithms
