Performance and Portability with OpenCL for Throughput-Oriented HPC Workloads across Accelerators, Coprocessors, and Multicore Processors

  • Chongxiao Cao
  • Mark Gates
  • Azzam Haidar
  • Piotr Luszczek
  • Stanimire Tomov
  • Ichitaro Yamazaki
  • Jack Dongarra

Research output: Chapter in Book/Report/Conference proceeding - Conference contribution - peer-review

6 Scopus citations

Abstract

Ever since accelerators and coprocessors became mainstream hardware for throughput-oriented HPC workloads, various programming techniques have been proposed to increase productivity in terms of both performance and ease of use. We evaluate these aspects of OpenCL on a number of hardware platforms for an important subset of dense linear algebra operations that are relevant to a wide range of scientific applications. Our findings indicate that OpenCL portability has improved since our previous publication and that many new and surprising usage scenarios are possible, rivaling those available after decades of software development on CPUs. The combined performance-portability metric, even though not promised by the OpenCL standard, reflects the need for tuning performance-critical operations during the porting process, and we show how a large portion of the available efficiency is lost if the tuning is not done correctly.

Original language: English
Title of host publication: Proceedings of ScalA 2014
Subtitle of host publication: 5th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems - held in conjunction with SC 2014: The International Conference for High Performance Computing, Networking, Storage and Analysis
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 61-68
Number of pages: 8
ISBN (Electronic): 9781479975624
State: Published - Nov 16 2014
Externally published: Yes
Event: 5th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, ScalA 2014 Held in Conjunction with the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 201 - New Orleans, United States
Duration: Nov 16 2014 to Nov 21 2014

Publication series

Name: Proceedings of ScalA 2014: 5th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems - held in conjunction with SC 2014: The International Conference for High Performance Computing, Networking, Storage and Analysis

Conference

Conference: 5th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, ScalA 2014 Held in Conjunction with the International Conference for High Performance Computing, Networking, Storage and Analysis, SC 201
Country/Territory: United States
City: New Orleans
Period: 11/16/14 to 11/21/14
