Abstract
Dense matrix multiplication is one of the most common numerical operations, especially in dense linear algebra, where it forms the core of many important algorithms, including solvers for linear systems of equations, least squares problems, and singular value and eigenvalue problems. The Cell B.E. excels at compute-intensive workloads such as single precision matrix multiplication, thanks to its powerful SIMD capabilities. This chapter dissects the implementations of two single precision matrix multiplication kernels for the SIMD cores of the Cell B.E. (the SPEs), one implementing the C = C - A × Bᵀ operation and the other implementing the C = C - A × B operation, for fixed-size matrices of 64 × 64 elements. The dual-issue architecture of the SPEs allows the floating-point operations to be balanced against the memory and permutation operations, leading to utilization of the floating-point pipeline in excess of 99% in both cases.
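To make the two operations concrete, here is a minimal scalar reference sketch in C. It is not the SIMD code the chapter describes. It only states what each kernel computes on 64 × 64 tiles. The function names `sgemm_nt_ref` and `sgemm_nn_ref` and the row-major storage layout are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum { N = 64 };  /* fixed tile size stated in the abstract */

/* Reference for C = C - A * B^T on N x N row-major tiles.
 * Hypothetical name; a plain scalar loop nest, not an SPE kernel. */
void sgemm_nt_ref(float *C, const float *A, const float *B)
{
    assert(A && B && C);
    for (size_t i = 0; i < N; i++) {
        for (size_t j = 0; j < N; j++) {
            float acc = C[i * N + j];
            /* Row i of A times row j of B, i.e. column j of B^T. */
            for (size_t k = 0; k < N; k++)
                acc -= A[i * N + k] * B[j * N + k];
            C[i * N + j] = acc;
        }
    }
}

/* Reference for C = C - A * B on N x N row-major tiles.
 * Hypothetical name; same assumptions as above. */
void sgemm_nn_ref(float *C, const float *A, const float *B)
{
    assert(A && B && C);
    for (size_t i = 0; i < N; i++) {
        for (size_t j = 0; j < N; j++) {
            float acc = C[i * N + j];
            /* Row i of A times column j of B. */
            for (size_t k = 0; k < N; k++)
                acc -= A[i * N + k] * B[k * N + j];
            C[i * N + j] = acc;
        }
    }
}
```

Each call performs 2 × 64³ = 524,288 floating-point operations (one multiply and one subtract per inner iteration). A scalar reference of this kind is mainly useful for checking the output of an optimized kernel, keeping in mind that a SIMD implementation may accumulate in a different order and so differ in the last bits.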
Original language | English |
---|---|
Title of host publication | Scientific Computing with Multicore and Accelerators |
Publisher | CRC Press |
Pages | 3-20 |
Number of pages | 18 |
ISBN (Electronic) | 9781439825372 |
ISBN (Print) | 9781439825365 |
State | Published - Jan 1 2010 |