TY - GEN
T1 - Multi-FFT vectorization for the Cell multicore processor
AU - Barhen, J.
AU - Humble, T.
AU - Mitra, P.
AU - Traweek, M.
PY - 2010
Y1 - 2010
N2 - The emergence of streaming multicore processors with multi-SIMD architectures and ultra-low power operation combined with real-time compute and I/O reconfigurability opens unprecedented opportunities for executing sophisticated signal processing algorithms faster and within a much lower energy budget. Here, we present an unconventional FFT implementation scheme for the IBM Cell, named transverse vectorization. It is shown to outperform (in terms of both timing and GFLOP throughput) the fastest FFT results reported to date in the open literature.
AB - The emergence of streaming multicore processors with multi-SIMD architectures and ultra-low power operation combined with real-time compute and I/O reconfigurability opens unprecedented opportunities for executing sophisticated signal processing algorithms faster and within a much lower energy budget. Here, we present an unconventional FFT implementation scheme for the IBM Cell, named transverse vectorization. It is shown to outperform (in terms of both timing and GFLOP throughput) the fastest FFT results reported to date in the open literature.
KW - FFT
KW - IBM Cell
KW - Multicore processors
KW - Transverse vectorization
UR - http://www.scopus.com/inward/record.url?scp=77954924907&partnerID=8YFLogxK
U2 - 10.1109/CCGRID.2010.78
DO - 10.1109/CCGRID.2010.78
M3 - Conference contribution
AN - SCOPUS:77954924907
SN - 9781424469871
T3 - CCGrid 2010 - 10th IEEE/ACM International Conference on Cluster, Cloud, and Grid Computing
SP - 780
EP - 785
BT - CCGrid 2010 - 10th IEEE/ACM International Conference on Cluster, Cloud, and Grid Computing
T2 - 10th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, CCGrid 2010
Y2 - 17 May 2010 through 20 May 2010
ER -