TY - GEN
T1 - CUDA grid-level task progression algorithms
AU - Kartsaklis, Christos
AU - Joubert, Wayne
AU - Hernandez, Oscar R.
AU - Eisenbach, Markus
AU - Elwasif, Wael R.
AU - Bernholdt, David E.
N1 - Publisher Copyright: © 2015 IEEE.
PY - 2015/11/23
Y1 - 2015/11/23
N2 - Tasking is a prominent parallel programming model. In this paper, we conduct a first study into the feasibility of task-parallel execution at the CUDA grid level, rather than the stream/kernel level, for regular task graphs with fixed in-out dependencies, similar to those found in wavefront computational patterns, which makes the findings broadly applicable. We propose and evaluate three CUDA task progression algorithms, in which threadblocks cooperatively process the task graph, and discuss their performance in terms of tasking throughput, atomics, and memory I/O overheads. Our initial results demonstrate a throughput of 38 million tasks/second on a Kepler K20X architecture.
KW - Computational modeling
KW - Graphics processing units
KW - Instruction sets
KW - Kernel
KW - Parallel processing
KW - Radiation detectors
KW - Runtime
UR - http://www.scopus.com/inward/record.url?scp=84961752403&partnerID=8YFLogxK
U2 - 10.1109/HPCC-CSS-ICESS.2015.53
DO - 10.1109/HPCC-CSS-ICESS.2015.53
M3 - Conference contribution
AN - SCOPUS:84961752403
T3 - Proceedings - 2015 IEEE 17th International Conference on High Performance Computing and Communications, 2015 IEEE 7th International Symposium on Cyberspace Safety and Security and 2015 IEEE 12th International Conference on Embedded Software and Systems, HPCC-CSS-ICESS 2015
SP - 1628
EP - 1632
BT - Proceedings - 2015 IEEE 17th International Conference on High Performance Computing and Communications, 2015 IEEE 7th International Symposium on Cyberspace Safety and Security and 2015 IEEE 12th International Conference on Embedded Software and Systems, HPCC-CSS-ICESS 2015
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 17th IEEE International Conference on High Performance Computing and Communications, IEEE 7th International Symposium on Cyberspace Safety and Security and IEEE 12th International Conference on Embedded Software and Systems, HPCC-CSS-ICESS 2015
Y2 - 24 August 2015 through 26 August 2015
ER -