TY - GEN
T1 - Performance analysis and acceleration of explicit integration for large kinetic networks using batched GPU computations
AU - Haidar, Azzam
AU - Brock, Benjamin
AU - Tomov, Stanimire
AU - Guidry, Michael
AU - Billings, Jay Jay
AU - Shyles, Daniel
AU - Dongarra, Jack
N1 - Publisher Copyright: © 2016 IEEE.
PY - 2016/11/28
Y1 - 2016/11/28
N2 - We demonstrate the systematic implementation of recently developed fast explicit kinetic integration algorithms that efficiently solve N coupled ordinary differential equations (subject to initial conditions) on modern GPUs. We take representative test cases (Type Ia supernova explosions) and demonstrate an increase in efficiency of two or more orders of magnitude for solving such systems (realistic thermonuclear networks coupled to fluid dynamics). This implies that important coupled multiphysics problems in various scientific and technical disciplines that were previously intractable, or could be simulated only with highly schematic kinetic networks, are now computationally feasible. As examples of such applications, we present the computational techniques developed for our ongoing deployment of these new methods on modern GPU accelerators. We show that, as in many other scientific applications, ranging from national security to medical advances, the computation can be split into many independent computational tasks, each of relatively small size. Because no individual task provides sufficient parallelism for the underlying hardware, especially for accelerators, these tasks must be computed concurrently as a single routine, which we call a batched routine, in order to saturate the hardware with enough work.
UR - http://www.scopus.com/inward/record.url?scp=85007048788&partnerID=8YFLogxK
U2 - 10.1109/HPEC.2016.7761605
DO - 10.1109/HPEC.2016.7761605
M3 - Conference contribution
AN - SCOPUS:85007048788
T3 - 2016 IEEE High Performance Extreme Computing Conference, HPEC 2016
BT - 2016 IEEE High Performance Extreme Computing Conference, HPEC 2016
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2016 IEEE High Performance Extreme Computing Conference, HPEC 2016
Y2 - 13 September 2016 through 15 September 2016
ER -