Evaluating and Optimizing OpenCL Base64 Data Unpacking Kernel with FPGA

Zheming Jin, Iris Johnson, Hal Finkel

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-reviewed

1 Scopus citation

Abstract

Developing applications in OpenCL for FPGAs is an emerging approach in heterogeneous computing. This paper uses the data unpacking algorithm in Base64 encoding as a case study to present programming and optimization techniques, along with experimental results, for OpenCL-based implementations on an FPGA. We explain the algorithm and evaluate the performance of the kernel implementations with Intel's FPGA OpenCL SDK. The experimental results show that kernel vectorization and kernel duplication are two optimization techniques that can improve kernel performance, and that the performance of kernel duplication is closely related to the local work size. In our experiments, 16-lane vectorization increases the bandwidth by a factor of 2 to 10 for large input data sizes. The performance of kernel duplication using 16 compute units is between 40% and 1.5% lower than that of kernel vectorization, depending on the input size. Tuning the local work size can improve kernel performance by a factor of 3 to 23. For this kernel, using local memory is not an effective optimization because the input data is not reused. A combination of vectorization and duplication achieves the highest performance, 12.3 GiB/s. Compared to an Intel Xeon E5 CPU and an Nvidia Tesla K80 GPU, the kernel on the Arria 10 FPGA is 6.7X faster than the CPU and 3X slower than the GPU. The performance per watt on the FPGA is 20.5X higher than on the CPU and 1.19X lower than on the GPU.
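As a rough illustration of the kind of per-element work such a kernel performs, the following C sketch splits each 3-byte input group into four 6-bit Base64 indices. It is not the authors' kernel: it assumes that "data unpacking" refers to this 3-byte-to-4-index step of Base64 encoding, and the function name `base64_unpack` and its parameters are hypothetical. Each group is processed independently of the others and each input byte is read once, which is consistent with the abstract's observations that vectorization and duplication help while local memory does not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the paper): unpack n_groups groups of
 * 3 input bytes into 4 six-bit values each (range 0..63).
 * `in` must hold 3 * n_groups bytes; `out` must hold 4 * n_groups bytes.
 */
void base64_unpack(const uint8_t *in, uint8_t *out, size_t n_groups)
{
    assert(in != NULL && out != NULL);

    /* Every iteration is independent; in an OpenCL kernel each
       iteration could map to one work-item. */
    for (size_t g = 0; g < n_groups; ++g) {
        const uint8_t b0 = in[3 * g];
        const uint8_t b1 = in[3 * g + 1];
        const uint8_t b2 = in[3 * g + 2];

        out[4 * g + 0] = (uint8_t)(b0 >> 2);                        /* top 6 bits of b0 */
        out[4 * g + 1] = (uint8_t)(((b0 & 0x03) << 4) | (b1 >> 4)); /* low 2 of b0, top 4 of b1 */
        out[4 * g + 2] = (uint8_t)(((b1 & 0x0F) << 2) | (b2 >> 6)); /* low 4 of b1, top 2 of b2 */
        out[4 * g + 3] = (uint8_t)(b2 & 0x3F);                      /* low 6 bits of b2 */
    }
}
```

For example, the input bytes "Man" unpack to the indices 19, 22, 5, 46, which correspond to the Base64 characters "TWFu" in the standard alphabet.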

Original language: English
Title of host publication: Proceedings - 26th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, PDP 2018
Editors: Igor Kotenko, Ivan Merelli, Pietro Lio
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 273-277
Number of pages: 5
ISBN (Electronic): 9781538649756
State: Published - Jun 6 2018
Externally published: Yes
Event: 26th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, PDP 2018 - Cambridge, United Kingdom
Duration: Mar 21 2018 to Mar 23 2018

Publication series

Name: Proceedings - 26th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, PDP 2018

Conference

Conference: 26th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, PDP 2018
Country/Territory: United Kingdom
City: Cambridge
Period: 03/21/18 to 03/23/18

Keywords

  • Base64 Encoding
  • FPGA
  • OpenCL
