TY - GEN
T1 - In-depth optimization with the OpenACC-to-FPGA framework on an arria 10 FPGA
AU - Lambert, Jacob
AU - Lee, Seyong
AU - Vetter, Jeffrey S.
AU - Malony, Allen
N1 - Publisher Copyright:
© 2020 IEEE.
PY - 2020/5
Y1 - 2020/5
N2 - The reconfigurable computing paradigm that uses field programmable gate arrays (FPGAs) has received renewed interest in the high-performance computing field due to FPGAs' unique combination of performance and energy efficiency. However, difficulties in programming and optimizing FPGAs have prevented them from being widely accepted as general-purpose computing devices. In accelerator-based heterogeneous computing, portability across diverse heterogeneous devices is also an important issue, but the unique architectural features in FPGAs make this difficult to achieve. To address these issues, a directive-based, high-level FPGA programming and optimization framework was previously developed. In this work, developed optimizations were combined holistically using the directive-based approach to show that each individual benchmark requires a unique set of optimizations to maximize performance. The relationships between FPGA resource usages and runtime performance were also explored.
AB - The reconfigurable computing paradigm that uses field programmable gate arrays (FPGAs) has received renewed interest in the high-performance computing field due to FPGAs' unique combination of performance and energy efficiency. However, difficulties in programming and optimizing FPGAs have prevented them from being widely accepted as general-purpose computing devices. In accelerator-based heterogeneous computing, portability across diverse heterogeneous devices is also an important issue, but the unique architectural features in FPGAs make this difficult to achieve. To address these issues, a directive-based, high-level FPGA programming and optimization framework was previously developed. In this work, developed optimizations were combined holistically using the directive-based approach to show that each individual benchmark requires a unique set of optimizations to maximize performance. The relationships between FPGA resource usages and runtime performance were also explored.
KW - Compiler optimization
KW - Directive-based programming
KW - FPGA
KW - OpenACC
KW - OpenARC
UR - http://www.scopus.com/inward/record.url?scp=85091584803&partnerID=8YFLogxK
U2 - 10.1109/IPDPSW50202.2020.00084
DO - 10.1109/IPDPSW50202.2020.00084
M3 - Conference contribution
AN - SCOPUS:85091584803
T3 - Proceedings - 2020 IEEE 34th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2020
SP - 460
EP - 470
BT - Proceedings - 2020 IEEE 34th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2020
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 34th IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2020
Y2 - 18 May 2020 through 22 May 2020
ER -