TY - GEN
T1 - Evaluating Burst Buffer Placement in HPC Systems
AU - Khetawat, Harsh
AU - Zimmer, Christopher
AU - Mueller, Frank
AU - Atchley, Scott
AU - Vazhkudai, Sudharshan S.
AU - Mubarak, Misbah
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/9
Y1 - 2019/9
N2 - Burst buffers (BBs) are increasingly exploited in contemporary supercomputers to bridge the performance gap between compute and storage systems. The design of BBs, particularly the placement of these devices and the underlying network topology, impacts both performance and cost. As the cost of other components such as memory and accelerators is increasing, it is becoming more important that HPC centers provision BBs tailored to their workloads. This work contributes a provisioning system to provide accurate, multi-tenant simulations that model realistic application and storage workloads from HPC systems. The framework aids HPC centers in modeling their workloads against multiple network and BB configurations rapidly. In experiments with our framework, we provide a comparison of representative Oak Ridge Leadership Computing Facility (OLCF) I/O workloads against multiple BB designs. We analyze the impact of these designs on latency, I/O phase lengths, contention for network and storage devices, and choice of network topology.
AB - Burst buffers (BBs) are increasingly exploited in contemporary supercomputers to bridge the performance gap between compute and storage systems. The design of BBs, particularly the placement of these devices and the underlying network topology, impacts both performance and cost. As the cost of other components such as memory and accelerators is increasing, it is becoming more important that HPC centers provision BBs tailored to their workloads. This work contributes a provisioning system to provide accurate, multi-tenant simulations that model realistic application and storage workloads from HPC systems. The framework aids HPC centers in modeling their workloads against multiple network and BB configurations rapidly. In experiments with our framework, we provide a comparison of representative Oak Ridge Leadership Computing Facility (OLCF) I/O workloads against multiple BB designs. We analyze the impact of these designs on latency, I/O phase lengths, contention for network and storage devices, and choice of network topology.
KW - HPC
KW - I/O
KW - burst buffers
KW - simulation
UR - http://www.scopus.com/inward/record.url?scp=85075269959&partnerID=8YFLogxK
U2 - 10.1109/CLUSTER.2019.8891051
DO - 10.1109/CLUSTER.2019.8891051
M3 - Conference contribution
AN - SCOPUS:85075269959
T3 - Proceedings - IEEE International Conference on Cluster Computing, ICCC
BT - Proceedings - 2019 IEEE International Conference on Cluster Computing, CLUSTER 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2019 IEEE International Conference on Cluster Computing, CLUSTER 2019
Y2 - 23 September 2019 through 26 September 2019
ER -