Improving Energy Saving of One-Sided Matrix Decompositions on CPU-GPU Heterogeneous Systems

Jieyang Chen, Xin Liang, Kai Zhao, Hadi Zamani Sabzi, Laxmi Bhuyan, Zizhong Chen

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

2 Scopus citations

Abstract

One-sided dense matrix decompositions (e.g., Cholesky, LU, and QR) are the key components in scientific computing in many different fields. Although their design has been highly optimized for modern processors, they still consume a considerable amount of energy. As CPU-GPU heterogeneous systems are commonly used for matrix decompositions, in this work, we aim to further improve the energy saving of one-sided matrix decompositions on CPU-GPU heterogeneous systems. We first build an Algorithm-Based Fault Tolerance protected overclocking technique (ABFT-OC) to enable us to exploit reliable overclocking for key matrix decomposition operations. Then, we design an energy-saving matrix decomposition framework, Bi-directional Slack Reclamation (BSR), that can intelligently combine the capability provided by ABFT-OC and DVFS to maximize energy saving and maintain performance and reliability. Experiments show that BSR is able to save up to 11.7% more energy compared with the current best energy saving optimization approach with no performance degradation and up to 14.1% Energy×Delay² reduction. Also, BSR enables the Pareto efficient performance-energy trade-off, which is able to provide up to 1.43× performance improvement without costing extra energy.

Original language: English
Title of host publication: PPoPP 2023 - Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming
Publisher: Association for Computing Machinery
Pages: 274-287
Number of pages: 14
ISBN (Electronic): 9798400700156
DOIs
State: Published - Feb 25 2023
Externally published: Yes
Event: 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, PPoPP 2023 - Montreal, Canada
Duration: Feb 25 2023 to Mar 1 2023

Publication series

Name: Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP

Conference

Conference: 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, PPoPP 2023
Country/Territory: Canada
City: Montreal
Period: 02/25/23 to 03/1/23

Funding

This work was supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, Scientific Discovery through Advanced Computing (SciDAC) program under Award Number DESC0022209. The research was also partly supported by NSF Grant 1907401.

Funders: Funder number
National Science Foundation: 1907401
U.S. Department of Energy
Office of Science
Advanced Scientific Computing Research: DESC0022209

Keywords

• GPU
• energy saving
• fault tolerance
• matrix decomposition
