Automatically tuned linear algebra software

R. Clint Whaley, Jack J. Dongarra

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

443 Scopus citations

Abstract

This paper describes an approach for the automatic generation and optimization of numerical software for processors with deep memory hierarchies and pipelined functional units. The production of such software for machines ranging from desktop workstations to embedded processors can be a tedious and time-consuming process. The work described here can help in automating much of this process. We will concentrate our efforts on the widely used linear algebra kernels called the Basic Linear Algebra Subroutines (BLAS). In particular, the work presented here is for general matrix multiply, DGEMM. However, much of the technology and approach developed here can be applied to the other Level 3 BLAS, and the general strategy can have an impact on basic linear algebra operations in general and may be extended to other important kernel operations.

Original language: English
Title of host publication: SC 1998 - Proceedings of the ACM/IEEE Conference on Supercomputing
Publisher: Association for Computing Machinery
ISBN (Electronic): 081868707X
State: Published - 1998
Event: 1998 ACM/IEEE Conference on Supercomputing, SC 1998 - Orlando, United States
Duration: Nov 7 1998 → Nov 13 1998

Publication series

Name: Proceedings of the International Conference on Supercomputing
Volume: 1998-November

Conference

Conference: 1998 ACM/IEEE Conference on Supercomputing, SC 1998
Country/Territory: United States
City: Orlando
Period: 11/7/98 → 11/13/98

Keywords

  • BLAS
  • Code generation
  • High performance
  • Linear algebra
  • Optimization
  • Tuning
