The Case for Co-Designing Model Architectures with Hardware

Quentin Anthony, Jacob Hatef, Deepak Narayanan, Stella Biderman, Stas Bekman, Junqi Yin, Aamir Shafi, Hari Subramoni, Dhabaleswar Panda

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

While GPUs are responsible for training the vast majority of state-of-the-art deep learning models, the implications of their architecture are often overlooked when designing new deep learning (DL) models. As a consequence, modifying a DL model to be more amenable to the target hardware can significantly improve the runtime performance of DL training and inference. In this paper, we provide a set of guidelines for users to maximize the runtime performance of their transformer models. These guidelines have been created by carefully considering the impact of various model hyperparameters controlling model shape on the efficiency of the underlying computation kernels executed on the GPU. We find that the throughput of models with "efficient" model shapes is up to 39% higher than that of models with a similar number of parameters but unoptimized shapes, while preserving accuracy.
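
The abstract's core observation is that shape hyperparameters (hidden size, head count, vocabulary size) determine how well a transformer's underlying GEMM kernels map onto GPU hardware. The sketch below is a minimal illustration of that idea, not the paper's actual guidelines: the divisibility-by-64 alignment heuristic, the multiple-of-8 head-dimension check, and the function name `check_transformer_shape` are all assumptions introduced here, loosely based on commonly cited tensor-core alignment rules of thumb.

```python
# Illustrative sketch (not the paper's exact guidelines): flag transformer
# shape hyperparameters that may map poorly onto GPU matmul kernels.
# The alignment constants below are assumptions, not values from the paper.

def check_transformer_shape(hidden_size: int,
                            num_heads: int,
                            vocab_size: int,
                            alignment: int = 64) -> list[str]:
    """Return warnings for shape choices that may hurt GPU kernel efficiency."""
    warnings = []
    if hidden_size % alignment != 0:
        warnings.append(
            f"hidden_size={hidden_size} is not a multiple of {alignment}; "
            "attention/MLP GEMMs may fall off the fast tensor-core path")
    if hidden_size % num_heads != 0:
        warnings.append(
            f"hidden_size={hidden_size} is not divisible by num_heads={num_heads}")
    else:
        head_dim = hidden_size // num_heads
        if head_dim % 8 != 0:  # small-tile alignment heuristic (assumption)
            warnings.append(
                f"head_dim={head_dim} is not a multiple of 8; attention "
                "kernels may pad internally or use slower code paths")
    if vocab_size % alignment != 0:
        padded = ((vocab_size + alignment - 1) // alignment) * alignment
        warnings.append(
            f"vocab_size={vocab_size} is unaligned; padding the embedding "
            f"table to {padded} can speed up the output-projection GEMM")
    return warnings


if __name__ == "__main__":
    # Example: a GPT-2-style shape whose vocabulary size (50257) is unaligned.
    for w in check_transformer_shape(hidden_size=768, num_heads=12,
                                     vocab_size=50257):
        print("warning:", w)
```

In this toy check, only the vocabulary size triggers a warning; padding it up to the next aligned multiple trades a few unused embedding rows for a better-shaped output-projection matrix multiply, which is the kind of shape/throughput trade-off the paper quantifies.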

Original language: English
Title of host publication: 53rd International Conference on Parallel Processing, ICPP 2024 - Main Conference Proceedings
Publisher: Association for Computing Machinery
Pages: 84-96
Number of pages: 13
ISBN (Electronic): 9798400708428
DOIs
State: Published - Aug 12, 2024
Event: 53rd International Conference on Parallel Processing, ICPP 2024 - Gotland, Sweden
Duration: Aug 12, 2024 - Aug 15, 2024

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 53rd International Conference on Parallel Processing, ICPP 2024
Country/Territory: Sweden
City: Gotland
Period: 08/12/24 - 08/15/24

Funding

This research is supported in part by NSF grants #1818253, #1854828, #2007991, #2018627, #2311830, #2312927, and XRAC grant #NCR-130002.

Funders and funder numbers:
National Science Foundation: 1818253, 1854828, 2007991, 2018627, 2311830, 2312927
XRAC: NCR-130002
