Abstract
Offloading computation from a CPU to a hardware accelerator is becoming a more common solution for improving performance because traditional gains enabled by Moore's law and Dennard scaling have slowed. GPUs are often used as hardware accelerators, but field-programmable gate arrays (FPGAs) are gaining traction. FPGAs are beneficial because they allow hardware specific to a particular application to be created. However, they are notoriously difficult to program. To this end, two of the main FPGA manufacturers, Intel and Xilinx, have created tools and frameworks that enable the use of higher level languages to design FPGA hardware. Although Xilinx kernels can be designed by using C/C++, both Intel and Xilinx support the use of OpenCL C to architect FPGA hardware. However, not much is known about the portability and performance between these two device families other than the fact that it is theoretically possible to synthesize a kernel meant for Intel to Xilinx and vice versa. In this work, we evaluate the portability and performance of Intel and Xilinx kernels. We use OpenCL C implementations of a subset of the Rodinia benchmarking suite that were designed for an Intel FPGA and make the necessary modifications to create synthesizable OpenCL C kernels for a Xilinx FPGA. We find that the difficulty of porting certain kernel optimizations varies, depending on the construct. Once the minimum amount of modifications is made to create synthesizable hardware for the Xilinx platform, more nontrivial work is needed to improve performance. However, we find that constructs that are known to be performant for an FPGA should improve performance regardless of the platform; the difficulty comes in deciding how to invoke certain kernel optimizations while also abiding by the constraints enforced by a given platform's hardware compiler.
Original language | English |
---|---|
Title of host publication | International Workshop on OpenCL, IWOCL 2021 |
Publisher | Association for Computing Machinery |
ISBN (Electronic) | 9781450390330 |
DOIs | |
State | Published - Apr 27 2021 |
Event | 2021 International Workshop on OpenCL, IWOCL 2021 - Virtual, Online, Germany Duration: Apr 27 2021 → Apr 29 2021 |
Publication series
Name | ACM International Conference Proceeding Series |
---|
Conference
Conference | 2021 International Workshop on OpenCL, IWOCL 2021 |
---|---|
Country/Territory | Germany |
City | Virtual, Online |
Period | 04/27/21 → 04/29/21 |
Funding
The authors would like to acknowledge the ORNL Experimental Computing Laboratory team for its support with the compute resources and the software stack. We would also like to acknowledge Elizabeth Kirby of ORNL for her helpful suggestions and edits to this work. This research was supported in part by the following sources: National Science Foundation (NSF) under grant CNS-1763503, Defense Advanced Research Projects Agency (DARPA) Microsystems Technology Office (MTO) Domain-Specific System-on-Chip Program, and the US Department of Energy (DOE) Advanced Scientific Computing Research (ASCR) program. This manuscript has been co-authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan.
Keywords
- FPGA
- Rodinia
- Xilinx
- hardware accelerator
- high level synthesis
- performance
- portability