TY - GEN
T1 - Data integration tasks on heterogeneous systems using OpenCL
AU - Faber, Clayton J.
AU - Cabrera, Anthony M.
AU - Booker, Orondé
AU - Maayan, Gabe
AU - Chamberlain, Roger D.
N1 - Publisher Copyright: © 2019 Copyright held by the owner/author(s).
PY - 2019/5/13
Y1 - 2019/5/13
N2 - In the era of big data, many new algorithms are developed to find the most efficient way to perform computations on massive amounts of data. However, the preprocessing step for many of these applications is often overlooked. The Data Integration Benchmark Suite (DIBS) [1] was designed to understand the characteristics of dataset transformations in a hardware-agnostic way. Although these applications appear highly data parallel on the surface, caveats in their specifications can affect this characteristic. Even so, OpenCL can be an effective deployment environment for these applications. In this work, we take a subset of the data transformations from each category presented in DIBS and implement them in OpenCL to evaluate their performance on heterogeneous systems. To target heterogeneous systems, we take a common application and attempt to deploy it to three platforms targetable by OpenCL (CPU, GPU, and FPGA). The applications are evaluated by their average transformation data rate (see Figure 1). We illustrate the advantages of each compute device in the data integration space, along with the different communication schemes that the OpenCL platform allows for host/device communication.
AB - In the era of big data, many new algorithms are developed to find the most efficient way to perform computations on massive amounts of data. However, the preprocessing step for many of these applications is often overlooked. The Data Integration Benchmark Suite (DIBS) [1] was designed to understand the characteristics of dataset transformations in a hardware-agnostic way. Although these applications appear highly data parallel on the surface, caveats in their specifications can affect this characteristic. Even so, OpenCL can be an effective deployment environment for these applications. In this work, we take a subset of the data transformations from each category presented in DIBS and implement them in OpenCL to evaluate their performance on heterogeneous systems. To target heterogeneous systems, we take a common application and attempt to deploy it to three platforms targetable by OpenCL (CPU, GPU, and FPGA). The applications are evaluated by their average transformation data rate (see Figure 1). We illustrate the advantages of each compute device in the data integration space, along with the different communication schemes that the OpenCL platform allows for host/device communication.
UR - http://www.scopus.com/inward/record.url?scp=85069177413&partnerID=8YFLogxK
U2 - 10.1145/3318170.3318187
DO - 10.1145/3318170.3318187
M3 - Conference contribution
AN - SCOPUS:85069177413
T3 - ACM International Conference Proceeding Series
BT - Proceedings of the International Workshop on OpenCL, IWOCL 2019
PB - Association for Computing Machinery
T2 - 2019 International Workshop on OpenCL, IWOCL 2019
Y2 - 13 May 2019 through 15 May 2019
ER -