TY - GEN
T1 - Performance portability of a GPU enabled factorization with the DAGuE framework
AU - Bosilca, George
AU - Bouteiller, Aurelien
AU - Herault, Thomas
AU - Lemarinier, Pierre
AU - Saengpatsa, Narapat Ohm
AU - Tomov, Stanimire
AU - Dongarra, Jack J.
PY - 2011
Y1 - 2011
AB - Performance portability is a major challenge faced today by developers on heterogeneous high performance computers, which consist of an interconnect, memory with non-uniform access, many-core processors, and accelerators such as GPUs. Recent studies have successfully demonstrated that dense linear algebra operations can be efficiently handled by runtime systems using a DAG representation. In this work, we present the GPU subsystem of the DAGuE runtime, and assess, on the Cholesky factorization test case, the minimal effort required of a programmer to enable GPU acceleration in the DAGuE framework. The performance achieved by this unchanged code, on a variety of heterogeneous and distributed many-core and GPU resources, demonstrates the desired performance portability.
KW - DAG scheduling
KW - GPU
KW - cluster
KW - linear algebra
UR - http://www.scopus.com/inward/record.url?scp=80955167923&partnerID=8YFLogxK
U2 - 10.1109/CLUSTER.2011.51
DO - 10.1109/CLUSTER.2011.51
M3 - Conference contribution
AN - SCOPUS:80955167923
SN - 9780769545165
T3 - Proceedings - IEEE International Conference on Cluster Computing, ICCC
SP - 395
EP - 402
BT - Proceedings - 2011 IEEE International Conference on Cluster Computing, CLUSTER 2011
T2 - 2011 IEEE International Conference on Cluster Computing, CLUSTER 2011
Y2 - 26 September 2011 through 30 September 2011
ER -