Optimizing an atomics-based reduction Kernel on OpenCL FPGA platform

Zheming Jin, Hal Finkel

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

7 Scopus citations

Abstract

Field-programmable gate arrays (FPGAs) are becoming one of the heterogeneous computing components in high-performance computing. To facilitate the use of FPGAs by developers and researchers, high-level synthesis tools are pushing FPGA-based design abstraction from the register-transfer level to a high-level language design flow using OpenCL/C/C++. Currently, there are few studies of atomic functions in the OpenCL-based design flow on an FPGA. In this paper, we evaluate the performance of atomic functions using a reduction kernel on an OpenCL FPGA platform as a case study. We describe implementations of an integer sum-reduction kernel in OpenCL and optimize its memory accesses. Fully utilizing the bandwidth of the data bus brings a factor of 15 improvement over the baseline kernel. The kernel that uses local memory for atomic operations achieves a 6.8X speedup over the naïve kernel that uses global memory. Combining both optimizations leads to a 112X speedup. Compute unit duplication can be applied to the kernel to further improve performance by a factor of 2.9.
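
The contrast between the global-memory and local-memory atomic variants described in the abstract can be sketched roughly as follows. This is a minimal host-side illustration in plain C11, not the paper's OpenCL kernel: the function names and `GROUP_SIZE` are hypothetical, the loops run sequentially, and the atomics only mark where the atomic updates would occur. It does not model FPGA performance or the data-bus optimization.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical work-group size, chosen only for illustration. */
#define GROUP_SIZE 64

/* Naive variant: every element is added atomically to one shared
 * accumulator, standing in for an atomic add on global memory. */
static int reduce_global_atomics(const int *data, size_t n) {
    atomic_int result = 0;
    for (size_t i = 0; i < n; ++i) {
        atomic_fetch_add(&result, data[i]);
    }
    return atomic_load(&result);
}

/* Local variant: each group of GROUP_SIZE elements accumulates into its
 * own group-local counter (standing in for local memory), and only one
 * atomic add per group touches the shared accumulator. */
static int reduce_local_atomics(const int *data, size_t n) {
    atomic_int result = 0;
    for (size_t base = 0; base < n; base += GROUP_SIZE) {
        atomic_int local_sum = 0;
        size_t end = (base + GROUP_SIZE < n) ? base + GROUP_SIZE : n;
        for (size_t i = base; i < end; ++i) {
            atomic_fetch_add(&local_sum, data[i]);
        }
        atomic_fetch_add(&result, atomic_load(&local_sum));
    }
    return atomic_load(&result);
}
```

Both functions return the same sum for the same input; the difference is that the second issues roughly `n / GROUP_SIZE` updates to the shared accumulator instead of `n`, which is the kind of contention reduction the local-memory optimization targets.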

Original language: English
Title of host publication: Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 532-539
Number of pages: 8
ISBN (Print): 9781538655559
State: Published - Aug 3 2018
Externally published: Yes
Event: 32nd IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018 - Vancouver, Canada
Duration: May 21 2018 to May 25 2018

Publication series

Name: Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018

Conference

Conference: 32nd IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018
Country/Territory: Canada
City: Vancouver
Period: 05/21/18 to 05/25/18

Keywords

  • Atomics
  • FPGA
  • OpenCL
  • Reductions
