Evaluating the Performance of Integer Sum Reduction in SYCL on GPUs

Zheming Jin, Jeffrey Vetter

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

SYCL is a promising programming model for heterogeneous computing, allowing single-source code to target devices from multiple vendors. One important primitive executed on these accelerators is integer sum reduction. This paper presents several SYCL implementations of integer sum reduction, using atomic functions, shared local memory, vectorized memory accesses, and parameterized workload sizes, to compare the performance and maturity of SYCL against open-source, vendor-specific implementations of the same reduction. For a sufficiently large number of integers, tuning the parameters of our SYCL implementations achieves a 1.4X speedup over the open-source implementations on an Intel UHD630 integrated GPU. On an Nvidia P100 GPU, the SYCL reduction is 3% faster than the templated reduction in Thrust and 0.3% faster than the device reduction in CUB. On an Nvidia V100 GPU, it is 1.9% faster than the templated reduction in Thrust and 0.4% faster than the device reduction in CUB.
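The ingredients named in the abstract (a parameterized workload per work-item, a reduction in shared local memory, and an atomic update of the final result) can be illustrated with a minimal host-side sketch in standard C++. This is not the authors' SYCL code and makes several assumptions: the names `reduce_group`, `sum_reduce`, `group_size`, and `items_per_thread` are hypothetical, work-groups run sequentially on the CPU rather than in parallel on a GPU, a local vector stands in for shared local memory, the accumulator is widened to 64 bits to avoid overflow, and vectorized memory accesses are not modeled.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical emulation of one work-group (sketch only, not SYCL).
// `local` stands in for shared local memory; `total` for the global result.
void reduce_group(const std::vector<int>& data, std::size_t group_id,
                  std::size_t group_size, std::size_t items_per_thread,
                  std::atomic<std::int64_t>& total) {
    std::vector<std::int64_t> local(group_size, 0);
    const std::size_t tile = group_size * items_per_thread;
    const std::size_t base = group_id * tile;

    // Step 1: each emulated work-item (lid) accumulates items_per_thread
    // elements, striding by group_size so neighbours read adjacent elements.
    for (std::size_t lid = 0; lid < group_size; ++lid) {
        for (std::size_t k = 0; k < items_per_thread; ++k) {
            const std::size_t idx = base + k * group_size + lid;
            if (idx < data.size()) local[lid] += data[idx];
        }
    }

    // Step 2: tree reduction over the local buffer; each pass halves the
    // number of active work-items (a barrier would separate passes on a GPU).
    for (std::size_t stride = group_size / 2; stride > 0; stride /= 2) {
        for (std::size_t lid = 0; lid < stride; ++lid) {
            local[lid] += local[lid + stride];
        }
    }

    // Step 3: one atomic add per group publishes the partial sum.
    total.fetch_add(local[0], std::memory_order_relaxed);
}

// Sums all elements by splitting the input into tiles of
// group_size * items_per_thread elements, one tile per emulated group.
std::int64_t sum_reduce(const std::vector<int>& data,
                        std::size_t group_size = 256,
                        std::size_t items_per_thread = 4) {
    // The tree reduction above assumes a power-of-two group size.
    assert(group_size > 0 && (group_size & (group_size - 1)) == 0);
    assert(items_per_thread > 0);

    const std::size_t tile = group_size * items_per_thread;
    const std::size_t num_groups = (data.size() + tile - 1) / tile;
    std::atomic<std::int64_t> total{0};
    for (std::size_t g = 0; g < num_groups; ++g) {
        reduce_group(data, g, group_size, items_per_thread, total);
    }
    return total.load();
}
```

Calling `sum_reduce(values, 128, 8)` on a `std::vector<int>` named `values` would, under these assumptions, use 128 emulated work-items per group with 8 elements each. Varying those two arguments corresponds loosely to the kind of parameter tuning the abstract describes, although any performance effect can only be observed on a real device.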

Original language: English
Title of host publication: 50th International Conference on Parallel Processing Workshop, ICPP 2021 - Proceedings
Publisher: Association for Computing Machinery
ISBN (Electronic): 9781450384414
State: Published - Aug 9 2021
Event: 50th International Conference on Parallel Processing Workshop, ICPP 2021 - Virtual, Online, United States
Duration: Aug 9 2021 to Aug 12 2021

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 50th International Conference on Parallel Processing Workshop, ICPP 2021
Country/Territory: United States
City: Virtual, Online
Period: 08/9/21 to 08/12/21

Funding

We thank the reviewers for their comments and criticisms. This research used resources of the Experimental Computing Laboratory at Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. The research also used resources of the Intel DevCloud.

Funders (funder number):
U.S. Department of Energy (DE-AC05-00OR22725)
Office of Science

    Keywords

    • CUDA
    • GPGPU
    • OpenCL
    • Reduction
    • SYCL
