Robustness of Deep Learning Classification to Adversarial Input on GPUs: Asynchronous Parallel Accumulation Is a Source of Vulnerability

Sanjif Shanmugavelu, Mathieu Taillefumier, Christopher Culver, Vijay Ganesh, Oscar Hernandez, Ada Sedova

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The ability of machine learning (ML) classification models to resist small, targeted input perturbations, known as adversarial attacks, is a key measure of their safety and reliability. We show that floating-point non-associativity (FPNA) coupled with asynchronous parallel programming on GPUs is sufficient to cause misclassification, without any perturbation to the input. Additionally, we show that this misclassification is particularly significant for inputs close to the decision boundary and that standard adversarial robustness results may be overestimated by up to 4.6% when machine-level details are not considered. We first study a linear classifier, before focusing on standard Graph Neural Network (GNN) architectures and datasets used in robustness assessments. We develop a novel black-box attack using Bayesian optimization to discover external workloads that alter instruction scheduling, biasing the outputs of reductions on GPUs and reliably leading to misclassification. Motivated by these results, we present a new learnable permutation (LP) gradient-based approach for learning floating-point operation orderings that lead to misclassifications. The LP approach provides a worst-case estimate in a computationally efficient manner, avoiding the need to run identical experiments tens of thousands of times over a potentially large set of possible GPU states or architectures. Finally, using instrumentation-based testing, we investigate parallel reduction ordering across different GPU architectures under external background workloads, under multi-GPU virtualization, and under power capping. Our results demonstrate that parallel reduction ordering varies significantly across architectures under the first two conditions, substantially increasing the search space required to fully test the effects of this parallel scheduler-based vulnerability. These results and the methods developed here can help incorporate machine-level considerations into adversarial robustness assessments, which can make a difference in safety- and mission-critical applications.
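
To make the core mechanism concrete, the following short Python sketch (an illustration written for this summary, not code from the paper) builds a linear classifier whose bias places a fixed input essentially on the decision boundary, then accumulates the dot product in float32 under randomly permuted orderings, a stand-in for the different reduction orders an asynchronous GPU scheduler may produce. Because floating-point addition is not associative, the last-bit differences between orderings can be enough to flip the predicted class even though the input never changes.

import numpy as np

# Illustrative only: demonstrates floating-point non-associativity (FPNA)
# flipping the decision of a linear classifier for an input near the boundary.
rng = np.random.default_rng(0)
n = 4096
w = rng.standard_normal(n).astype(np.float32)   # classifier weights
x = rng.standard_normal(n).astype(np.float32)   # one fixed input

def accumulate(order):
    """Sum w[i] * x[i] in float32, following the given index order."""
    s = np.float32(0.0)
    for i in order:
        s = s + w[i] * x[i]
    return s

# Choose the bias so that one reference accumulation order lands exactly on
# the decision boundary w . x + b = 0.
b = -accumulate(np.arange(n))

classes = set()
for _ in range(200):
    order = rng.permutation(n)                   # stand-in for a GPU schedule
    classes.add(int(accumulate(order) + b > 0))  # predicted class for this order

print("distinct predicted classes over 200 accumulation orders:", classes)

With the input this close to the boundary, the printed set typically contains both classes: the classification depends on the accumulation order alone, which is the vulnerability the abstract describes.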

Original language: English
Title of host publication: Euro-Par 2025
Subtitle of host publication: Parallel Processing - 31st European Conference on Parallel and Distributed Processing, Proceedings
Editors: Wolfgang E. Nagel, Diana Goehringer, Pedro C. Diniz
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 306-320
Number of pages: 15
ISBN (Print): 9783031998560
DOIs
State: Published - 2026
Event: 31st International Conference on Parallel and Distributed Computing, Euro-Par 2025 - Dresden, Germany
Duration: Aug 25, 2025 - Aug 29, 2025

Publication series

Name: Lecture Notes in Computer Science
Volume: 15901 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 31st International Conference on Parallel and Distributed Computing, Euro-Par 2025
Country/Territory: Germany
City: Dresden
Period: 08/25/25 - 08/29/25

Funding

This work was supported in part by the ORNL AI LDRD Initiative, the Swiss Platform For Advanced Scientific Computing (PASC), and the Accelerated Data Analytics and Computing Institute (ADAC). It used resources of the OLCF, a DOE Office of Science User Facility [DE-AC05-00OR22725], and the Swiss National Supercomputing Centre. The authors thank Hayashi Akihiro and Pim Witlox for insightful discussions.
