TY - GEN
T1 - Tridiagonalization of a symmetric dense matrix on a GPU cluster
AU - Yamazaki, Ichitaro
AU - Dong, Tingxing
AU - Tomov, Stanimire
AU - Dongarra, Jack
PY - 2013
Y1 - 2013
N2 - Symmetric dense eigenvalue problems arise in many scientific and engineering simulations. In this paper, we use GPUs to accelerate their main computational kernel, the tridiagonalization of a dense symmetric matrix, on a distributed multicore architecture. We then study the performance of this hybrid message-passing/shared-memory/GPU-computing paradigm on up to 16 compute nodes, each of which consists of 16 Intel Sandy Bridge processors and three NVIDIA GPUs. These studies show that such a hybrid paradigm can exploit the underlying hardware architecture and obtain significant speedups over a flat message-passing paradigm, and they demonstrate the potential for efficiently solving large-scale eigenvalue problems on a GPU cluster. Furthermore, these studies may provide insights into the general effects of such hybrid paradigms on emerging high-performance computers.
AB - Symmetric dense eigenvalue problems arise in many scientific and engineering simulations. In this paper, we use GPUs to accelerate their main computational kernel, the tridiagonalization of a dense symmetric matrix, on a distributed multicore architecture. We then study the performance of this hybrid message-passing/shared-memory/GPU-computing paradigm on up to 16 compute nodes, each of which consists of 16 Intel Sandy Bridge processors and three NVIDIA GPUs. These studies show that such a hybrid paradigm can exploit the underlying hardware architecture and obtain significant speedups over a flat message-passing paradigm, and they demonstrate the potential for efficiently solving large-scale eigenvalue problems on a GPU cluster. Furthermore, these studies may provide insights into the general effects of such hybrid paradigms on emerging high-performance computers.
KW - GPU cluster
KW - dense symmetric tridiagonalization
KW - distributed multicores
KW - hybrid programming
UR - http://www.scopus.com/inward/record.url?scp=84899748806&partnerID=8YFLogxK
U2 - 10.1109/IPDPSW.2013.265
DO - 10.1109/IPDPSW.2013.265
M3 - Conference contribution
AN - SCOPUS:84899748806
SN - 9780769549798
T3 - Proceedings - IEEE 27th International Parallel and Distributed Processing Symposium Workshops and PhD Forum, IPDPSW 2013
SP - 1070
EP - 1079
BT - Proceedings - IEEE 27th International Parallel and Distributed Processing Symposium Workshops and PhD Forum, IPDPSW 2013
PB - IEEE Computer Society
T2 - 2013 IEEE 37th Annual Computer Software and Applications Conference, COMPSAC 2013
Y2 - 22 July 2013 through 26 July 2013
ER -