CUDA grid-level task progression algorithms

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

Tasking is a prominent parallel programming model. In this paper we conduct a first study into the feasibility of task-parallel execution at the CUDA grid, rather than the stream/kernel level, for regular, fixed in-out dependency task graphs, similar to those found in wavefront computational patterns, making the findings broadly applicable. We propose and evaluate three CUDA task progression algorithms, where threadblocks cooperatively process the task graph, and argue about their performance in terms of tasking throughput, atomics and memory IO overheads. Our initial results demonstrate a throughput of 38 million tasks/second on a Kepler K20X architecture.

Original languageEnglish
Title of host publicationProceedings - 2015 IEEE 17th International Conference on High Performance Computing and Communications, 2015 IEEE 7th International Symposium on Cyberspace Safety and Security and 2015 IEEE 12th International Conference on Embedded Software and Systems, HPCC-CSS-ICESS 2015
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages1628-1632
Number of pages5
ISBN (Electronic)9781479989362
DOIs
StatePublished - Nov 23 2015
Event17th IEEE International Conference on High Performance Computing and Communications, IEEE 7th International Symposium on Cyberspace Safety and Security and IEEE 12th International Conference on Embedded Software and Systems, HPCC-ICESS-CSS 2015 - New York, United States
Duration: Aug 24 2015Aug 26 2015

Publication series

NameProceedings - 2015 IEEE 17th International Conference on High Performance Computing and Communications, 2015 IEEE 7th International Symposium on Cyberspace Safety and Security and 2015 IEEE 12th International Conference on Embedded Software and Systems, HPCC-CSS-ICESS 2015

Conference

Conference17th IEEE International Conference on High Performance Computing and Communications, IEEE 7th International Symposium on Cyberspace Safety and Security and IEEE 12th International Conference on Embedded Software and Systems, HPCC-ICESS-CSS 2015
Country/TerritoryUnited States
CityNew York
Period08/24/1508/26/15

Keywords

  • Computational modeling
  • Graphics processing units
  • Instruction sets
  • Kernel
  • Parallel processing
  • Radiation detectors
  • Runtime

Fingerprint

Dive into the research topics of 'CUDA grid-level task progression algorithms'. Together they form a unique fingerprint.

Cite this