PSI/J: A Portable Interface for Submitting, Monitoring, and Managing Jobs

Mihael Hategan-Marandiuc, Andre Merzky, Nicholson Collier, Ketan Maheshwari, Jonathan Ozik, Matteo Turilli, Andreas Wilke, Justin M. Wozniak, Kyle Chard, Ian Foster, Rafael Ferreira Da Silva, Shantenu Jha, Daniel Laney

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

5 Scopus citations

Abstract

It is generally desirable for high-performance computing (HPC) applications to be portable between HPC systems, for example to make use of more performant hardware, to make effective use of allocations, and to co-locate compute jobs with large datasets. Unfortunately, moving scientific applications between HPC systems is challenging for various reasons, most notably that HPC systems use different schedulers. We introduce PSI/J, a job management abstraction API intended to simplify the construction of software components and applications that are portable across HPC scheduler implementations. We argue that such a system is necessary and that no viable alternative currently exists. We analyze notable similar APIs and attempt to determine the factors that influenced their evolution and adoption by the HPC community, and we base the design of PSI/J on that analysis. We describe how PSI/J has been integrated into three workflow systems and one application, and show experimentally that PSI/J imposes minimal overhead.

Original language: English
Title of host publication: Proceedings 2023 IEEE 19th International Conference on e-Science, e-Science 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350322231
State: Published - 2023
Event: 19th IEEE International Conference on e-Science, e-Science 2023 - Limassol, Cyprus
Duration: Oct 9 2023 to Oct 14 2023

Publication series

Name: Proceedings 2023 IEEE 19th International Conference on e-Science, e-Science 2023

Conference

Conference: 19th IEEE International Conference on e-Science, e-Science 2023
Country/Territory: Cyprus
City: Limassol
Period: 10/9/23 to 10/14/23

Funding

This research was supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344 (LLNL-CONF-826133), Argonne National Laboratory under Contract DE-AC02-06CH11357, and Brookhaven National Laboratory under Contract DE-SC0012704. This research used resources of the OLCF at ORNL, which is supported by the Office of Science of the U.S. DOE under Contract No. DE-AC05-00OR22725. OSPREY work was supported by the National Science Foundation under Grant No. 2200234, the National Institutes of Health under grant R01DA055502, and the DOE Office of Science through the Bio-preparedness Research Virtual Environment (BRaVE) initiative.

This manuscript has been authored in part by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The publisher, by accepting the article for publication, acknowledges that the U.S. Government retains a non-exclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of the manuscript, or allow others to do so, for U.S. Government purposes. The DOE will provide public access to these results in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).

Funder: Funder number
National Science Foundation: 2200234
National Institutes of Health: R01DA055502
U.S. Department of Energy: DE-AC05-00OR22725
Office of Science
National Nuclear Security Administration
Argonne National Laboratory: DE-AC02-06CH11357
Lawrence Livermore National Laboratory: LLNL-CONF-826133, DE-AC52-07NA27344
Brookhaven National Laboratory: DE-SC0012704
