Enabling event tracing at leadership-class scale through I/O forwarding middleware

Thomas Ilsche, Joseph Schuchart, Jason Cope, Dries Kimpe, Terry Jones, Andreas Knüpfer, Kamil Iskra, Robert Ross, Wolfgang E. Nagel, Stephen Poole

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

14 Scopus citations

Abstract

Event tracing is an important tool for understanding the performance of parallel applications. As concurrency increases in leadership-class computing systems, the quantity of performance log data can overload the parallel file system, perturbing the application being observed. In this work we present a solution for event tracing at leadership scales. We enhance the I/O forwarding system software to aggregate and reorganize log data prior to writing to the storage system, significantly reducing the burden on the underlying file system for this type of traffic. Furthermore, we augment the I/O forwarding system with a write buffering capability to limit the impact of artificial perturbations from log data accesses on traced applications. To validate the approach, we modify the Vampir tracing toolset to take advantage of this new capability and show that the approach increases the maximum traced application size by a factor of 5, to more than 200,000 processes.

Original language: English
Title of host publication: HPDC '12 - Proceedings of the 21st ACM Symposium on High-Performance Parallel and Distributed Computing
Pages: 49-60
Number of pages: 12
State: Published - 2012
Event: 21st ACM Symposium on High-Performance Parallel and Distributed Computing, HPDC '12 - Delft, Netherlands
Duration: Jun 18 2012 – Jun 22 2012

Publication series

Name: HPDC '12 - Proceedings of the 21st ACM Symposium on High-Performance Parallel and Distributed Computing

Conference

Conference: 21st ACM Symposium on High-Performance Parallel and Distributed Computing, HPDC '12
Country/Territory: Netherlands
City: Delft
Period: 06/18/12 – 06/22/12

Keywords

  • Atomic append
  • Event tracing
  • I/O forwarding
