TY - CONF
T1 - PSI/J: A Portable Interface for Submitting, Monitoring, and Managing Jobs
T2 - 19th IEEE International Conference on e-Science, e-Science 2023
AU - Hategan-Marandiuc, Mihael
AU - Merzky, Andre
AU - Collier, Nicholson
AU - Maheshwari, Ketan
AU - Ozik, Jonathan
AU - Turilli, Matteo
AU - Wilke, Andreas
AU - Wozniak, Justin M.
AU - Chard, Kyle
AU - Foster, Ian
AU - Ferreira da Silva, Rafael
AU - Jha, Shantenu
AU - Laney, Daniel
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
AB - It is generally desirable for high-performance computing (HPC) applications to be portable between HPC systems, for example to make use of more performant hardware, to make effective use of allocations, and to co-locate compute jobs with large datasets. Unfortunately, moving scientific applications between HPC systems is challenging for various reasons, most notably because HPC systems use different schedulers. We introduce PSI/J, a job management abstraction API intended to simplify the construction of software components and applications that are portable across various HPC scheduler implementations. We argue both that such a system is necessary and that no viable alternative currently exists. We analyze similar notable APIs and attempt to determine the factors that influenced their evolution and adoption by the HPC community, and we base the design of PSI/J on that analysis. We describe how PSI/J has been integrated into three workflow systems and one application, and we show via experiments that PSI/J imposes minimal overhead.
UR - http://www.scopus.com/inward/record.url?scp=85174248611&partnerID=8YFLogxK
DO - 10.1109/e-Science58273.2023.10254912
M3 - Conference contribution
AN - SCOPUS:85174248611
T3 - Proceedings - 2023 IEEE 19th International Conference on e-Science, e-Science 2023
BT - Proceedings - 2023 IEEE 19th International Conference on e-Science, e-Science 2023
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 9 October 2023 through 14 October 2023
ER -