Reducing the amount of out-of-core data access for GPU-accelerated randomized SVD

Yuechao Lu, Ichitaro Yamazaki, Fumihiko Ino, Yasuyuki Matsushita, Stanimire Tomov, Jack Dongarra

Research output: Contribution to journal › Article › peer-review

9 Scopus citations

Abstract

We propose two acceleration methods, namely, Fused and Gram, for reducing out-of-core data access when performing randomized singular value decomposition (RSVD) on graphics processing units (GPUs). Here, out-of-core data are data that are too large to fit into GPU memory at once. Both methods accelerate GPU-enabled RSVD using the following three schemes: (1) a highly tuned general matrix-matrix multiplication (GEMM) scheme for processing out-of-core data on GPUs; (2) a data-access reduction scheme based on one-dimensional data partitioning; and (3) a first-in, first-out scheme that reduces CPU-GPU data transfer using the reverse iteration. The Fused method further reduces the amount of out-of-core data access by merging two GEMM operations into a single operation. By contrast, the Gram method reduces both in-core and out-of-core data access by explicitly forming the Gram matrix. In our experiments, the Fused and Gram methods improved RSVD performance by up to 1.7× and 5.2×, respectively, compared with a straightforward method that deploys schemes (1) and (2) on the GPU. In addition, we present a case study of deploying the Gram method to accelerate robust principal component analysis, a convex optimization problem in machine learning.
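The abstract does not spell out the algorithms, so the following is only a minimal sketch of one plausible reading of the ideas it names, not the authors' implementation. It assumes the product of interest is A^T(AQ), as in an RSVD power iteration, and that A is partitioned one-dimensionally into row panels that would be streamed to the GPU one at a time. Plain Python lists stand in for GPU GEMM calls, and every name here (`row_blocks`, `two_pass_product`, `fused_product`, `gram_matrix`) is hypothetical.

```python
# Illustrative sketch only: plain Python lists stand in for GPU GEMM calls.

def matmul(X, Y):
    """Dense product X @ Y for matrices stored as lists of rows."""
    Yt = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in Yt] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def add(X, Y):
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def row_blocks(A, block_rows):
    """One-dimensional partition: yield consecutive row panels of A."""
    for start in range(0, len(A), block_rows):
        yield A[start:start + block_rows]

def two_pass_product(A, Q, block_rows):
    """Baseline: Y = A @ Q in one sweep over A, then A^T @ Y in a second sweep."""
    Y = []
    for panel in row_blocks(A, block_rows):       # sweep 1 over A
        Y.extend(matmul(panel, Q))
    Z, offset = None, 0
    for panel in row_blocks(A, block_rows):       # sweep 2 over A
        part = matmul(transpose(panel), Y[offset:offset + len(panel)])
        Z = part if Z is None else add(Z, part)
        offset += len(panel)
    return Z

def fused_product(A, Q, block_rows):
    """Fused-style: each panel is read once and used for both products."""
    Z = None
    for panel in row_blocks(A, block_rows):       # single sweep over A
        part = matmul(transpose(panel), matmul(panel, Q))
        Z = part if Z is None else add(Z, part)
    return Z

def gram_matrix(A, block_rows):
    """Gram-style: accumulate G = A^T A in a single sweep over A."""
    G = None
    for panel in row_blocks(A, block_rows):
        part = matmul(transpose(panel), panel)
        G = part if G is None else add(G, part)
    return G

A = [[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]]     # 5 x 2, tall and skinny
Q = [[1], [1]]

baseline = two_pass_product(A, Q, block_rows=2)
fused = fused_product(A, Q, block_rows=2)
G = gram_matrix(A, block_rows=2)                  # small enough to stay in core
gram = matmul(G, Q)                               # later products never touch A

print(baseline, fused, gram)                      # all three: [[355], [410]]
```

In this toy setting the baseline reads every panel twice per product, the fused variant reads each panel once, and the Gram variant reads A only while forming G, after which repeated products use the small matrix G alone. Whether this matches the paper's Fused and Gram methods in detail is an assumption.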

Original language: English
Article number: e5754
Journal: Concurrency and Computation: Practice and Experience
Volume: 32
Issue number: 19
State: Published - Oct 10 2020
Externally published: Yes

Funding

This research was supported in part by the “Program for Leading Graduate Schools” of the Ministry of Education, Culture, Sports, Science and Technology, Japan, and by the Japan Society for the Promotion of Science KAKENHI grant numbers 15H01687 and 16H02801. The authors thank the reviewers for their valuable comments.

Funders | Funder number
Japan Society for the Promotion of Science | 16H02801, 15H01687
Ministry of Education, Culture, Sports, Science and Technology |

    Keywords

    • GPU
    • divide and conquer
    • out-of-core computation
    • singular value decomposition
