Benchmarking Operators in Deep Neural Networks for Improving Performance Portability of SYCL

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

SYCL is a portable programming model for heterogeneous computing, so it is important to obtain reasonable performance portability of SYCL. Towards the goal of better understanding and improving performance portability of SYCL for machine learning workloads, we have been developing benchmarks for basic operators in deep neural networks (DNNs). These operators could be offloaded to heterogeneous computing devices such as graphics processing units (GPUs) to speed up computation. In this paper, we introduce the benchmarks, evaluate the performance of the operators on GPU-based systems, and describe the causes of the performance gap between the SYCL and Compute Unified Device Architecture (CUDA) kernels. We find that the causes are related to the utilization of the texture cache for read-only data, optimization of the memory accesses with strength reduction, use of local memory, and register usage per thread. We hope that the efforts of developing benchmarks for studying performance portability will stimulate discussion and interactions within the community.

Original language: English
Title of host publication: Languages and Compilers for Parallel Computing - 36th International Workshop, LCPC 2023, Revised Selected Papers
Editors: Henry Dietz
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 33-45
Number of pages: 13
ISBN (Print): 9783032024350
State: Published - 2026
Event: 36th International Workshop on Languages and Compilers for Parallel Computing, LCPC 2023 - Lexington, United States
Duration: Oct 11 2023 - Oct 13 2023

Publication series

Name: Lecture Notes in Computer Science
Volume: 14480 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 36th International Workshop on Languages and Compilers for Parallel Computing, LCPC 2023
Country/Territory: United States
City: Lexington
Period: 10/11/23 - 10/13/23

Funding

We appreciate the reviewers’ comments and suggestions. This research used resources of the Experimental Computing Laboratory at the Oak Ridge National Laboratory. This manuscript has been authored by UT-Battelle LLC under contract no. DE-AC05-00OR22725 with the US Department of Energy. The publisher, by accepting the article for publication, acknowledges that the US government retains a non-exclusive, paid up, irrevocable, world-wide license to publish or reproduce the published form of the manuscript, or allow others to do so, for US government purposes. The DOE will provide public access to these results in accordance with the DOE Public Access Plan.

Keywords

  • Benchmarks
  • DNN operators
  • Performance Portability
