Machine learning models for GPU error prediction in a large scale HPC system

Bin Nie, Ji Xue, Saurabh Gupta, Tirthak Patel, Christian Engelmann, Evgenia Smirni, Devesh Tiwari

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

60 Scopus citations
