TY - GEN
T1 - Investigating operating system noise in extreme-scale high-performance computing systems using simulation
AU - Engelmann, Christian
PY - 2013
Y1 - 2013
N2 - Hardware/software co-design for future-generation high-performance computing (HPC) systems aims at closing the gap between the peak capabilities of the hardware and the performance realized by applications (application-architecture performance gap). Performance profiling of architectures and applications is a crucial part of this iterative process. The work in this paper focuses on operating system (OS) noise as an additional factor to be considered for co-design. It represents the first step in including OS noise in HPC hardware/software co-design by adding a noise injection feature to an existing simulation-based co-design toolkit. It reuses an existing abstraction for OS noise with frequency (periodic recurrence) and period (duration of each occurrence) to enhance the processor model of the Extreme-scale Simulator (xSim) with synchronized and random OS noise simulation. The results demonstrate this capability by evaluating the impact of OS noise on MPI_Bcast() and MPI_Reduce() in a simulated future-generation HPC system with 2,097,152 compute nodes.
AB - Hardware/software co-design for future-generation high-performance computing (HPC) systems aims at closing the gap between the peak capabilities of the hardware and the performance realized by applications (application-architecture performance gap). Performance profiling of architectures and applications is a crucial part of this iterative process. The work in this paper focuses on operating system (OS) noise as an additional factor to be considered for co-design. It represents the first step in including OS noise in HPC hardware/software co-design by adding a noise injection feature to an existing simulation-based co-design toolkit. It reuses an existing abstraction for OS noise with frequency (periodic recurrence) and period (duration of each occurrence) to enhance the processor model of the Extreme-scale Simulator (xSim) with synchronized and random OS noise simulation. The results demonstrate this capability by evaluating the impact of OS noise on MPI_Bcast() and MPI_Reduce() in a simulated future-generation HPC system with 2,097,152 compute nodes.
KW - High-performance computing
KW - Operating system noise
KW - Parallel discrete event simulation
KW - Performance evaluation
UR - http://www.scopus.com/inward/record.url?scp=84875540947&partnerID=8YFLogxK
U2 - 10.2316/P.2013.795-010
DO - 10.2316/P.2013.795-010
M3 - Conference contribution
AN - SCOPUS:84875540947
SN - 9780889869431
T3 - IASTED Multiconferences - Proceedings of the IASTED International Conference on Parallel and Distributed Computing and Networks, PDCN 2013
SP - 670
EP - 677
BT - IASTED Multiconferences - Proceedings of the IASTED International Conference on Parallel and Distributed Computing and Networks, PDCN 2013
T2 - 11th IASTED International Conference on Parallel and Distributed Computing and Networks, PDCN 2013
Y2 - 11 February 2013 through 13 February 2013
ER -