TY - GEN
T1 - I/O Bottleneck Detection and Tuning
T2 - 6th IEEE/ACM International Parallel Data Systems Workshop, PDSW 2021
AU - Bez, Jean Luca
AU - Tang, Houjun
AU - Xie, Bing
AU - Williams-Young, David
AU - Latham, Rob
AU - Ross, Rob
AU - Oral, Sarp
AU - Byna, Suren
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021
Y1 - 2021
AB - Using parallel file systems efficiently is a tricky problem due to inter-dependencies among multiple layers of I/O software, including high-level I/O libraries (HDF5, netCDF, etc.), MPI-IO, POSIX, and file systems (GPFS, Lustre, etc.). Profiling tools such as Darshan collect traces to help understand the I/O performance behavior. However, there are significant gaps in analyzing the collected traces and then applying tuning options offered by various layers of I/O software. Seeking to connect the dots between I/O bottleneck detection and tuning, we propose DXT Explorer, an interactive log analysis tool. In this paper, we present a case study using our interactive log analysis tool to identify and apply various I/O optimizations. We report an evaluation of performance improvement achieved for four I/O kernels extracted from science applications.
UR - http://www.scopus.com/inward/record.url?scp=85124177075&partnerID=8YFLogxK
U2 - 10.1109/PDSW54622.2021.00008
DO - 10.1109/PDSW54622.2021.00008
M3 - Conference contribution
AN - SCOPUS:85124177075
T3 - Proceedings of PDSW 2021: IEEE/ACM 6th International Parallel Data Systems Workshop, Held in conjunction with SC 2021: The International Conference for High Performance Computing, Networking, Storage and Analysis
SP - 15
EP - 22
BT - Proceedings of PDSW 2021
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 15 November 2021
ER -