Low-Order Finite Element Solver with Small Matrix-Matrix Multiplication Accelerated by AI-Specific Hardware for Crustal Deformation Computation

Takuma Yamaguchi, Kohei Fujita, Tsuyoshi Ichimura, Akira Naruse, Jack C. Wells, Christopher J. Zimmer, Tjerk P. Straatsma, Muneo Hori, Lalith Maddegedara, Naonori Ueda

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

5 Scopus citations

Abstract

This study proposes a fast low-order finite element solver for crustal deformation computations by applying Tensor Core, AI-specific hardware on a Volta GPU. Tensor Core can compute large matrix-matrix multiplications rapidly in half precision. We redesign a state-of-the-art solver algorithm so that lower-precision data types can be used and memory access costs can be reduced even when we use small matrices. With the proposed solver, we solved 13 billion degrees-of-freedom two-layered problems that mimicked the Earth's crust and mantle using 36 compute nodes of Summit. In the matrix-vector kernel, we obtained a 4.1-fold speedup over a standard kernel in a single-precision format. Our proposed solver increased the FLOP count of the entire solver; however, we reduced the time-to-solution by 1.7-fold since the Tensor Core provided a high effective performance.
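As a rough illustration of the numerical pattern the abstract describes (many small matrix-matrix products with half-precision inputs), the NumPy sketch below rounds the inputs to FP16 and accumulates the products in FP32, then measures the error against a double-precision reference. It runs on the CPU and is not the authors' kernel. The function name, the 16 × 16 block size, the batch count, and the choice of FP32 accumulation are all assumptions made for this example.

```python
import numpy as np


def matmul_fp16_inputs_fp32_accumulate(a, b):
    """Emulate a mixed-precision product: FP16 inputs, FP32 accumulation."""
    # Round both operands to half precision (this is where accuracy is lost).
    a16 = a.astype(np.float16)
    b16 = b.astype(np.float16)
    # Cast back up so the multiply-adds are carried out in single precision.
    return np.matmul(a16.astype(np.float32), b16.astype(np.float32))


rng = np.random.default_rng(0)

# Hypothetical batch of small dense blocks; sizes are illustrative only.
batch, n = 1000, 16
a = rng.uniform(-1.0, 1.0, size=(batch, n, n)).astype(np.float32)
b = rng.uniform(-1.0, 1.0, size=(batch, n, n)).astype(np.float32)

# Double-precision reference for measuring the rounding error.
reference = np.matmul(a.astype(np.float64), b.astype(np.float64))
mixed = matmul_fp16_inputs_fp32_accumulate(a, b)

rel_error = np.linalg.norm(mixed - reference) / np.linalg.norm(reference)
print(f"relative error with FP16 inputs: {rel_error:.2e}")
```

The error printed here is much larger than single-precision rounding alone would give, which is why a solver that moves data into lower-precision types generally needs to be redesigned around that loss rather than simply recompiled.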

Original language: English
Title of host publication: Proceedings of the Platform for Advanced Scientific Computing Conference, PASC 2020
Publisher: Association for Computing Machinery
ISBN (Electronic): 9781450379939
DOIs
State: Published - Jun 29 2020
Event: 7th Annual Platform for Advanced Scientific Computing Conference, PASC 2020 - Geneva, Switzerland
Duration: Jun 29 2020 → Jul 1 2020

Publication series

Name: Proceedings of the Platform for Advanced Scientific Computing Conference, PASC 2020

Conference

Conference: 7th Annual Platform for Advanced Scientific Computing Conference, PASC 2020
Country/Territory: Switzerland
City: Geneva
Period: 06/29/20 → 07/1/20

Funding

Our results were obtained using Summit at the Oak Ridge Leadership Computing Facility, a US Department of Energy, Office of Science User Facility at Oak Ridge National Laboratory (ORNL). We thank Yukihiko Hirano (NVIDIA) for coordinating the collaborative research project. We thank Christopher B. Fuson, Don E. Maxwell, Oscar Hernandez, Scott Atchley, Veronica Melesse-Vergara (ORNL), Jeff Larkin, Stephen Abbott (NVIDIA), Lixiang Luo (IBM), and Richard Graham (Mellanox Technologies) for generous support concerning the use of Summit. We thank Noda Tomoyuki and Hikaru Inoue (Fujitsu Limited) for support in program development. We acknowledge support from the Japan Society for the Promotion of Science (18H05239 and 18K18873).

Keywords

  • Conjugate gradient method
  • Finite element analysis
  • GPU computation
  • Transprecision computing
