TY - GEN
T1 - Bob Jenkins Lookup3 Hash Function on OpenCL FPGA Platform
AU - Jin, Zheming
AU - Finkel, Hal
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2018/7/2
Y1 - 2018/7/2
AB - The field-programmable gate array (FPGA) is a promising heterogeneous computing component for energy-aware, high-performance applications. Emerging high-level synthesis (HLS) tools, such as the Intel FPGA Software Development Kit for Open Computing Language (OpenCL), offer a streamlined design flow that makes FPGAs easier for scientists and researchers to use. In this paper, we focus on optimizing the OpenCL design of the Bob Jenkins lookup3 hash function, which is used in the open-source version of Memcached. We describe the kernel optimizations on the FPGA in detail and evaluate the resource utilization, performance, and performance per watt of the kernel implementations on an Arria 10-based FPGA platform. The experimental results show that the optimized design achieves a 3.46X speedup in kernel execution time over the baseline implementation on the Nallatech 385A FPGA card, which features an Arria 10 GX 1150 FPGA chip. In performance per watt, we achieve 8 MHash/watt on the Arria 10 FPGA, a 14X and 1.2X improvement over an Intel Xeon E5 CPU and an Nvidia K80 GPU, respectively.
KW - Bob Jenkins hash function
KW - FPGA
KW - OpenCL
UR - http://www.scopus.com/inward/record.url?scp=85062627672&partnerID=8YFLogxK
U2 - 10.1109/BigData.2018.8621960
DO - 10.1109/BigData.2018.8621960
M3 - Conference contribution
AN - SCOPUS:85062627672
T3 - Proceedings - 2018 IEEE International Conference on Big Data, Big Data 2018
SP - 4736
EP - 4741
BT - Proceedings - 2018 IEEE International Conference on Big Data, Big Data 2018
A2 - Abe, Naoki
A2 - Liu, Huan
A2 - Pu, Calton
A2 - Hu, Xiaohua
A2 - Ahmed, Nesreen
A2 - Qiao, Mu
A2 - Song, Yang
A2 - Kossmann, Donald
A2 - Liu, Bing
A2 - Lee, Kisung
A2 - Tang, Jiliang
A2 - He, Jingrui
A2 - Saltz, Jeffrey
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2018 IEEE International Conference on Big Data, Big Data 2018
Y2 - 10 December 2018 through 13 December 2018
ER -