Design and Analysis of Soft-Error Resilience Mechanisms for GPU Register File

Sparsh Mittal, Haonan Wang, Adwait Jog, Jeffrey S. Vetter

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

3 Scopus citations

Abstract

Modern graphics processing units (GPUs) use increasingly large register files (RFs), which occupy a large fraction of the GPU core area and are accessed very frequently. This makes the RF vulnerable to soft errors (SEs). In this paper, we present two techniques for improving the SE resilience of the GPU RF. First, we propose compressing RF values to reduce the number of vulnerable bits. We leverage value similarity and the presence of narrow-width values to perform compression at the warp level and the thread level, respectively. Second, we propose selective hardening, in which a portion of each register entry is designed with SE-immune circuits. Used together, these techniques provide higher resilience at lower overhead. Without hardening, our warp-level and thread-level compression techniques reduce SE vulnerability by 47.0% and 40.8%, respectively.
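
The abstract does not spell out the encodings, so the following is only a minimal sketch of the general idea: fewer stored bits means fewer bits exposed to soft errors. It assumes 32-bit registers, 32 threads per warp, hypothetical size classes of 8/16/32 bits, a 2-bit tag per compressed entry, and a simple base-plus-delta scheme for the warp-level case. All function names and parameters are illustrative and are not taken from the paper.

```python
MASK32 = 0xFFFFFFFF
REG_BITS = 32    # bits per thread register (assumption)
WARP_SIZE = 32   # threads per warp (assumption)

def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def narrow_width_bits(x):
    """Bits needed for x once redundant sign-extension bits are dropped."""
    s = to_signed32(x)
    magnitude = s if s >= 0 else ~s
    return magnitude.bit_length() + 1  # +1 keeps one sign bit

def size_class(need, classes):
    """Smallest allowed storage width that fits `need` bits."""
    return min(c for c in classes if c >= need)

def thread_level_bits(values, classes=(8, 16, 32), tag_bits=2):
    """Thread-level sketch: each register is narrowed independently."""
    return sum(size_class(narrow_width_bits(v), classes) + tag_bits
               for v in values)

def warp_level_bits(values, classes=(8, 16, 32), tag_bits=2):
    """Warp-level sketch: one full base value plus equal-width deltas."""
    base = values[0]
    deltas = [to_signed32(v - base) for v in values[1:]]
    if all(d == 0 for d in deltas):
        width = 0  # every thread holds the same value
    else:
        width = size_class(max(narrow_width_bits(d) for d in deltas), classes)
    return REG_BITS + tag_bits + width * len(deltas)

def reduction(compressed_bits, n_values):
    """Fraction of stored (vulnerable) bits removed by compression."""
    return 1.0 - compressed_bits / (n_values * REG_BITS)

# Toy input: per-thread addresses with a fixed stride of 4.
addresses = [1000 + 4 * t for t in range(WARP_SIZE)]
print(thread_level_bits(addresses), warp_level_bits(addresses))  # prints: 576 282
```

For this toy input the warp-level sketch stores fewer bits because neighboring threads hold similar values, while the thread-level sketch only exploits each value being narrow. These numbers come from the toy data and the assumed encoding; they are unrelated to the 47.0% and 40.8% figures reported in the abstract.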

Original language: English
Title of host publication: Proceedings - 2017 30th International Conference on VLSI Design and 2017 16th International Conference on Embedded Systems, VLSID 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 409-414
Number of pages: 6
ISBN (Electronic): 9781509057405
DOIs
State: Published - Mar 21 2017
Event: 30th International Conference on VLSI Design and 2017 16th International Conference on Embedded Systems, VLSID 2017 - Hyderabad, India
Duration: Jan 7 2017 to Jan 11 2017

Publication series

Name: Proceedings - 2017 30th International Conference on VLSI Design and 2017 16th International Conference on Embedded Systems, VLSID 2017

Conference

Conference: 30th International Conference on VLSI Design and 2017 16th International Conference on Embedded Systems, VLSID 2017
Country/Territory: India
City: Hyderabad
Period: 01/7/17 to 01/11/17

Funding

Funder: National Science Foundation
Funder number: 1657336

    Keywords

    • GPU
    • data compression
    • narrow-value detection
    • register file
    • soft-error resilience
