Workflow Anomaly Detection with Graph Neural Networks

Hongwei Jin, Krishnan Raghavan, George Papadimitriou, Cong Wang, Anirban Mandal, Patrycja Krawczuk, Loic Pottier, Mariam Kiran, Ewa Deelman, Prasanna Balaprakash

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

5 Scopus citations

Abstract

Reliable execution of scientific workflows is a fundamental concern in computational campaigns. Therefore, detecting and diagnosing anomalies are both important and challenging for workflow executions that span complex, distributed computing infrastructures. In this paper we model the scientific workflow as a directed acyclic graph and apply graph neural networks (GNNs) to identify the anomalies at both the workflow and individual job levels. In addition, we generalize our GNN model to take into account a set of workflows together for the anomaly detection task rather than a specific workflow. By taking advantage of learning the hidden representation, not only from the job features, but also from the topological information of the workflow, our GNN models demonstrate higher accuracy and better runtime efficiency when compared with conventional machine learning models and other convolutional neural network approaches.
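As a rough illustration of the idea in the abstract, the sketch below represents a small workflow as a graph of jobs and runs one round of neighborhood averaging, the basic message-passing step behind most GNNs, followed by a mean-pool readout for a workflow-level vector. It is not the authors' model: the job names, the two features per job (runtime and a normalized I/O figure), and the use of undirected neighborhoods with self-loops are all assumptions made for the example, and a real GNN would add learned weights, nonlinearities, and trained classification heads.

```python
# Hypothetical toy workflow: job names, edges, and feature values are illustrative only.
edges = [
    ("stage_in", "process_a"),
    ("stage_in", "process_b"),
    ("process_a", "merge"),
    ("process_b", "merge"),
]
# Assumed per-job features: [runtime in seconds, normalized I/O volume].
features = {
    "stage_in": [1.0, 0.2],
    "process_a": [3.0, 0.5],
    "process_b": [9.0, 0.5],  # unusually long runtime, the kind of job one might flag
    "merge": [2.0, 0.1],
}


def neighbors(edges, nodes):
    """Build neighbor sets with self-loops; edge direction is ignored (a simplification)."""
    nbrs = {n: {n} for n in nodes}
    for src, dst in edges:
        nbrs[src].add(dst)
        nbrs[dst].add(src)
    return nbrs


def message_pass(features, nbrs):
    """One mean-aggregation step: each job's new vector averages its neighborhood."""
    out = {}
    for node, group in nbrs.items():
        dim = len(features[node])
        out[node] = [sum(features[m][i] for m in group) / len(group) for i in range(dim)]
    return out


def readout(hidden):
    """Mean-pool all job vectors into a single workflow-level vector."""
    dim = len(next(iter(hidden.values())))
    return [sum(h[i] for h in hidden.values()) / len(hidden) for i in range(dim)]


nbrs = neighbors(edges, features)
hidden = message_pass(features, nbrs)   # job-level representations
workflow_embedding = readout(hidden)    # workflow-level representation
```

In a trained model, the job-level vectors in `hidden` would feed a per-job anomaly classifier and `workflow_embedding` a per-workflow one; here they only show how topology mixes information between a job and its parents and children.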

Original language: English
Title of host publication: Proceedings of WORKS 2022
Subtitle of host publication: 17th Workshop on Workflows in Support of Large-Scale Science, Held in conjunction with SC 2022: The International Conference for High Performance Computing, Networking, Storage and Analysis
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 35-42
Number of pages: 8
ISBN (Electronic): 9781665451918
State: Published - 2022
Externally published: Yes
Event: 17th IEEE/ACM Workshop on Workflows in Support of Large-Scale Science, WORKS 2022 - Dallas, United States
Duration: Nov 13 2022 to Nov 18 2022

Publication series

Name: Proceedings of WORKS 2022: 17th Workshop on Workflows in Support of Large-Scale Science, Held in conjunction with SC 2022: The International Conference for High Performance Computing, Networking, Storage and Analysis

Conference

Conference: 17th IEEE/ACM Workshop on Workflows in Support of Large-Scale Science, WORKS 2022
Country/Territory: United States
City: Dallas
Period: 11/13/22 to 11/18/22

Funding

This work is funded by the Department of Energy under the Integrated Computational and Data Infrastructure (ICDI) for Scientific Discovery, grant #DE-SC0022328. Experimental data was collected on the ExoGENI testbed supported by NSF. This material is based upon work supported by the U.S. Department of Energy, Office of Science, under contract number DE-AC02-06CH11357.

Keywords

  • Anomaly detection
  • Graph neural networks
  • Scientific workflows
