TY - GEN
T1 - A novel hybrid CPU-GPU generalized eigensolver for electronic structure calculations based on fine grained memory aware tasks
AU - Solcà, Raffaele
AU - Haidar, Azzam
AU - Tomov, Stanimire
AU - Schulthess, Thomas C.
AU - Dongarra, Jack
PY - 2012
Y1 - 2012
N2 - The adoption of hybrid GPU-CPU nodes in traditional supercomputing platforms such as the Cray-XK6 opens acceleration opportunities for electronic structure calculations in materials science and chemistry applications, where medium-sized generalized eigenvalue problems must be solved many times. These eigenvalue problems are too small to be solved effectively on distributed systems, but can benefit from the massive compute performance concentrated on a single-node hybrid GPU-CPU system. However, hybrid systems call for the development of new algorithms that efficiently exploit the heterogeneity and massive parallelism of not just GPUs, but of multi/many-core CPUs as well. Addressing these demands, we developed a novel algorithm featuring fine-grained memory-aware tasks, hybrid execution/scheduling, and increased computational intensity. The resulting eigensolvers are state-of-the-art in HPC, significantly outperforming existing libraries. We describe the algorithm and analyze its performance impact on applications of interest when different fractions of eigenvectors are needed by the host electronic structure code.
AB - The adoption of hybrid GPU-CPU nodes in traditional supercomputing platforms such as the Cray-XK6 opens acceleration opportunities for electronic structure calculations in materials science and chemistry applications, where medium-sized generalized eigenvalue problems must be solved many times. These eigenvalue problems are too small to be solved effectively on distributed systems, but can benefit from the massive compute performance concentrated on a single-node hybrid GPU-CPU system. However, hybrid systems call for the development of new algorithms that efficiently exploit the heterogeneity and massive parallelism of not just GPUs, but of multi/many-core CPUs as well. Addressing these demands, we developed a novel algorithm featuring fine-grained memory-aware tasks, hybrid execution/scheduling, and increased computational intensity. The resulting eigensolvers are state-of-the-art in HPC, significantly outperforming existing libraries. We describe the algorithm and analyze its performance impact on applications of interest when different fractions of eigenvectors are needed by the host electronic structure code.
KW - 2-stage algorithm
KW - GPU
KW - eigenvalue and eigenvectors computation
KW - generalized eigenvalue problem
KW - hybrid computing
UR - http://www.scopus.com/inward/record.url?scp=84876578579&partnerID=8YFLogxK
U2 - 10.1109/SC.Companion.2012.173
DO - 10.1109/SC.Companion.2012.173
M3 - Conference contribution
AN - SCOPUS:84876578579
SN - 9780769549569
T3 - Proceedings - 2012 SC Companion: High Performance Computing, Networking Storage and Analysis, SCC 2012
SP - 1338
EP - 1340
BT - Proceedings - 2012 SC Companion
T2 - 2012 SC Companion: High Performance Computing, Networking Storage and Analysis, SCC 2012
Y2 - 10 November 2012 through 16 November 2012
ER -