TY - GEN
T1 - Exploiting latent I/O asynchrony in petascale science applications
AU - Widener, Patrick
AU - Payne, Mary
AU - Bridges, Patrick
AU - Wolf, Matthew
AU - Abbasi, Hasan
AU - McManus, Scott
AU - Schwan, Karsten
PY - 2009
Y1 - 2009
AB - We present a collection of techniques for exploiting latent I/O asynchrony which can substantially improve performance in data-intensive parallel applications. Latent asynchrony refers to an application's tolerance for decoupling ancillary operations from its core computation, and is a property of HPC codes not fully explored by current HPC I/O systems. Decoupling operations such as buffering and staging, reorganization, and format conversion in space and in time from core codes can shorten I/O phases, preserving valuable MPP compute cycles. We describe in this paper DataTaps, IOgraphs, and Metabots, three tools which allow HPC developers to implement decoupled I/O operations. Using these tools, asynchrony can be exploited by data generators which overlap computation with communication, and by data consumers that perform data conversion and reorganization out-of-band and on-demand. In the context of a data-intensive fusion simulation, we show that exploiting latent asynchrony through decoupling of operations can provide significant performance benefits.
UR - http://www.scopus.com/inward/record.url?scp=77949511374&partnerID=8YFLogxK
U2 - 10.1109/ICPPW.2009.67
DO - 10.1109/ICPPW.2009.67
M3 - Conference contribution
AN - SCOPUS:77949511374
SN - 9780769538037
T3 - Proceedings of the International Conference on Parallel Processing Workshops
SP - 105
EP - 112
BT - ICPPW 2009 - The 38th International Conference on Parallel Processing Workshops
T2 - 38th International Conference on Parallel Processing Workshops, ICPPW 2009
Y2 - 22 September 2009 through 25 September 2009
ER -