A software ecosystem for multi-level provenance management in large-scale scientific workflows for AI applications

  • G. Padovani
  • V. Anantharaj
  • L. Sacco
  • T. Kurihana
  • M. Bunino
  • K. Tsolaki
  • M. Girone
  • F. Antonio
  • C. Sopranzetti
  • M. Fronza
  • S. Fiore

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Scopus citations

Abstract

Scientific workflows and provenance are two sides of the same coin. While the former addresses the coordinated execution of multiple tasks over a set of computational resources, the latter relates to the historical record of data from its original sources. This paper highlights the importance of tracking multi-level provenance metadata in complex, AI-based scientific workflows as a way to (i) foster and (ii) expand documentation of experiments, (iii) enable reproducibility, (iv) address interpretability of the results, (v) facilitate the diagnosis of performance bottlenecks, and (vi) advance provenance exploration and analysis opportunities.

Original language: English
Title of host publication: Proceedings of SC 2024-W
Subtitle of host publication: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 2024-2031
Number of pages: 8
ISBN (Electronic): 9798350355543
DOIs
State: Published - 2024
Event: 2024 Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC Workshops 2024 - Atlanta, United States
Duration: Nov 17 2024 → Nov 22 2024

Publication series

Name: Proceedings of SC 2024-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis

Conference

Conference: 2024 Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC Workshops 2024
Country/Territory: United States
City: Atlanta
Period: 11/17/24 → 11/22/24

Funding

This work was partially funded by the EU HE interTwin project (GA 101058386) and the EU HE Climateurope2 project (GA 101056933). Moreover, this work was partially funded under the NRRP, Mission 4 Component 2 Investment 1.4, by the European Union - NextGenerationEU (proj. nr. CN 00000013). This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.

Keywords

  • ML tasks
  • Provenance
  • multi-level
  • workflow
