On-line automated performance diagnosis on thousands of processes

Philip C. Roth, Barton P. Miller

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

33 Scopus citations

Abstract

Performance analysis tools are critical for the effective use of large parallel computing resources, but existing tools have failed to address three problems that limit their scalability: (1) management and processing of the volume of performance data generated when monitoring a large number of application processes, (2) communication between a large number of tool components, and (3) presentation of performance data and analysis results for applications with a large number of processes. In this paper, we present a novel approach for finding performance problems in applications with a large number of processes that leverages our multicast and data aggregation infrastructure to address these three performance tool scalability barriers. First, we show how to design a scalable, distributed performance diagnosis facility. We demonstrate this design with an on-line, automated strategy for finding performance bottlenecks. Our strategy uses distributed, independent bottleneck search agents located in the tool agent processes that monitor running application processes. Second, we present a technique for constructing compact displays of the results of our bottleneck detection strategy. This technique, called the Sub-Graph Folding Algorithm, presents bottleneck search results using dynamic graphs that record the refinement of a bottleneck search. The complexity of the results graph is controlled by combining sub-graphs showing similar local application behavior into a composite sub-graph. Using an approach that combines these two synergistic parts, we performed bottleneck searches on programs with up to 1024 processes with no sign of tool resource saturation. With 1024 application processes, our visualization technique reduced a search results graph containing over 30,000 nodes to a single composite 44-node sub-graph showing the same qualitative performance information as the original graph.

Original language: English
Title of host publication: Proceedings of the 2006 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP'06
Publisher: Association for Computing Machinery (ACM)
Pages: 69-80
Number of pages: 12
ISBN (Print): 1595931899, 9781595931894
State: Published - 2006
Event: 2006 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP'06 - New York, NY, United States
Duration: Mar 29 2006 to Mar 31 2006

Publication series

Name: Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP
Volume: 2006

Conference

Conference: 2006 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP'06
Country/Territory: United States
City: New York, NY
Period: 03/29/06 to 03/31/06

Keywords

  • Automation
  • Paradyn
  • Performance diagnosis
  • Scalability
  • Tools
