TY - GEN
T1 - Evaluation of MD5Hash Kernel on OpenCL FPGA platform
AU - Jin, Zheming
AU - Finkel, Hal
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2018/8/3
Y1 - 2018/8/3
AB - The field-programmable gate array (FPGA) is a promising heterogeneous computing component for energy-aware high-performance computing applications. Emerging high-level synthesis (HLS) tools such as the Intel OpenCL SDK provide a streamlined design flow that facilitates the use of FPGAs. In this paper, we evaluate the MD5 hashing kernel in the Scalable Heterogeneous Computing (SHOC) benchmark suite. We explain the kernel and then evaluate the resource usage, performance, and performance per watt of the kernel implementations on the FPGA. The experimental results show that the optimized implementation using kernel duplication achieves a 6.1X speedup over the naïve code on the Nallatech 385A FPGA card, which features an Arria 10 FPGA chip. For performance per watt, we achieve 43 million hashes per watt on an Intel Arria 10 GX1150 FPGA, a 39.7X, 15.6X, and 1.07X improvement over a dual-socket Intel Xeon E5-2687W processor, an Intel Xeon Phi Knights Landing 7210 processor, and an Nvidia K80 GPU, respectively.
KW - FPGA
KW - MD5 Hashing
KW - OpenCL
UR - http://www.scopus.com/inward/record.url?scp=85052206017&partnerID=8YFLogxK
U2 - 10.1109/IPDPSW.2018.00157
DO - 10.1109/IPDPSW.2018.00157
M3 - Conference contribution
AN - SCOPUS:85052206017
SN - 9781538655559
T3 - Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018
SP - 1026
EP - 1032
BT - Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 32nd IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018
Y2 - 21 May 2018 through 25 May 2018
ER -