Exploiting Scratchpad Memory for Deep Temporal Blocking: A case study for 2D Jacobian 5-point iterative stencil kernel (j2d5pt)

Lingqi Zhang, Mohamed Wahib, Peng Chen, Jintao Meng, Xiao Wang, Toshio Endo, Satoshi Matsuoka

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

General-purpose graphics processing units (GPGPUs) are used in most of the top HPC systems, and the total capacity of scratchpad memory has grown more than 40-fold over the last decade. However, existing temporal blocking optimizations for stencil computations have not aggressively exploited this large capacity. This work uses the 2D Jacobian 5-point iterative stencil as a case study to investigate the use of large scratchpad memory. Unlike existing research, which tiles the domain per thread block, we tile the domain so that each tile is large enough to utilize all available scratchpad memory on the GPU. Consequently, we process several time steps inside a single tile before writing the result back to global memory. Our evaluation shows performance comparable to state-of-the-art implementations, yet our implementation is much simpler and does not require auto-generated code.
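The idea of advancing several time steps inside one tile before writing back can be illustrated with a minimal CPU-side sketch. It is not the authors' GPU implementation: it assumes overlapped (halo) tiling, fixed boundary values, and equal weights of 0.2, and the names `jacobi_step`, `run_naive`, and `run_temporal_blocked` are hypothetical. The `local` buffer stands in for the scratchpad-resident tile.

```python
def jacobi_step(grid):
    """One 5-point sweep; edge cells are kept fixed (assumed boundary condition)."""
    h, w = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            # Assumed equal weights of 0.2 for the centre and its four neighbours.
            out[i][j] = 0.2 * (grid[i][j] + grid[i - 1][j] + grid[i + 1][j]
                               + grid[i][j - 1] + grid[i][j + 1])
    return out


def run_naive(grid, steps):
    """Reference: every sweep reads and writes the whole domain."""
    for _ in range(steps):
        grid = jacobi_step(grid)
    return grid


def run_temporal_blocked(grid, steps, tile):
    """Advance `steps` sweeps one tile at a time using overlapped tiling."""
    h, w = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for ti in range(1, h - 1, tile):
        for tj in range(1, w - 1, tile):
            # Load the tile plus a halo `steps` cells wide, clamped to the domain.
            i0, i1 = max(ti - steps, 0), min(ti + tile + steps, h)
            j0, j1 = max(tj - steps, 0), min(tj + tile + steps, w)
            local = [row[j0:j1] for row in grid[i0:i1]]
            # All time steps run on the local buffer; stale values creep in
            # from the halo edge by one cell per step and never reach the tile.
            for _ in range(steps):
                local = jacobi_step(local)
            # Write back only the tile itself; the halo is discarded.
            for i in range(ti, min(ti + tile, h - 1)):
                for j in range(tj, min(tj + tile, w - 1)):
                    out[i][j] = local[i - i0][j - j0]
    return out
```

In this sketch a larger `tile` lowers the share of redundant halo work for a given `steps`, which is one reason a larger on-chip buffer can make deeper temporal blocking attractive.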

Original language: English
Title of host publication: Proceedings of the 15th Workshop on General Purpose Processing Using GPU, GPGPU 2023
Publisher: Association for Computing Machinery
Pages: 34-35
Number of pages: 2
ISBN (Electronic): 9798400707766
State: Published - Feb 25 2023
Event: 15th Annual Workshop on General Purpose Processing using Graphics Processing Unit, GPGPU 2023 - Montreal, Canada
Duration: Feb 25 2023 → …

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 15th Annual Workshop on General Purpose Processing using Graphics Processing Unit, GPGPU 2023
Country/Territory: Canada
City: Montreal
Period: 02/25/23 → …

Funding

This work was supported by JSPS KAKENHI under Grant Number JP21K17750. This paper is based on results obtained from a project, JPNP20006, commissioned by the New Energy and Industrial Technology Development Organization (NEDO). This research used resources at the Oak Ridge Leadership Computing Facility, a DOE Office of Science User Facility operated by the Oak Ridge National Laboratory. The authors wish to express their sincere appreciation to Jens Domke, Aleksandr Drozd, Emil Vatai and other RIKEN R-CCS colleagues for their invaluable advice and guidance throughout the course of this research. Finally, the first author would also like to express his gratitude to RIKEN R-CCS for offering the opportunity to undertake this research in an intern program.

Keywords

  • GPGPU
  • Iterative Stencil Solvers
  • Temporal Blocking
