TY - GEN
T1 - Power and performance tradeoff of a floating-point intensive Kernel on OpenCL FPGA platform
AU - Jin, Zheming
AU - Finkel, Hal
N1 - Publisher Copyright:
© 2018 IEEE.
PY - 2018/8/3
Y1 - 2018/8/3
N2 - As power is recognized as one of the first-order constraints in high-performance computing, understanding how power and performance are affected using a high-level synthesis tool for a floating-point intensive kernel is important. FPGAs offer a promising solution for high-performance and energy-efficient computing applications. This paper presents the impact of the optimizations of a floating-point intensive kernel from a geographical information system upon the performance and power using an FPGA. Using an OpenCL-based high-level synthesis tool for FPGAs, we evaluate the resource usage, performance and power consumption of the kernel implementations on an Arria 10-based FPGA platform. We compare the performance and energy efficiency of the kernel implementations on an Arria 10 GX1150 FPGA, an Intel Xeon Phi Knights Landing CPU, and an NVIDIA Tesla K80 GPU. Our experiment shows that the performance per watt of the kernel implementation on the FPGA is 1.79X better than the CPU and 1.56X better than the GPU. The execution time on the FPGA is approximately 2.9X slower.
AB - As power is recognized as one of the first-order constraints in high-performance computing, understanding how power and performance are affected using a high-level synthesis tool for a floating-point intensive kernel is important. FPGAs offer a promising solution for high-performance and energy-efficient computing applications. This paper presents the impact of the optimizations of a floating-point intensive kernel from a geographical information system upon the performance and power using an FPGA. Using an OpenCL-based high-level synthesis tool for FPGAs, we evaluate the resource usage, performance and power consumption of the kernel implementations on an Arria 10-based FPGA platform. We compare the performance and energy efficiency of the kernel implementations on an Arria 10 GX1150 FPGA, an Intel Xeon Phi Knights Landing CPU, and an NVIDIA Tesla K80 GPU. Our experiment shows that the performance per watt of the kernel implementation on the FPGA is 1.79X better than the CPU and 1.56X better than the GPU. The execution time on the FPGA is approximately 2.9X slower.
KW - FPGA
KW - Floating point intensive kernel
KW - OpenCL
KW - Performance and power tradeoff
UR - http://www.scopus.com/inward/record.url?scp=85052211685&partnerID=8YFLogxK
U2 - 10.1109/IPDPSW.2018.00115
DO - 10.1109/IPDPSW.2018.00115
M3 - Conference contribution
AN - SCOPUS:85052211685
SN - 9781538655559
T3 - Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018
SP - 716
EP - 720
BT - Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 32nd IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018
Y2 - 21 May 2018 through 25 May 2018
ER -