Falcon: On-line monitoring for steering parallel programs

Weiming Gu, Greg Eisenhauer, Karsten Schwan, Jeffrey Vetter

Research output: Contribution to journal › Article › peer-review

48 Scopus citations

Abstract

Advances in high performance computing, communications and user interfaces enable developers to construct increasingly interactive high performance applications. The Falcon system presented in this paper supports such interactivity by providing runtime libraries, tools and user interfaces that permit the on-line monitoring and steering of large-scale parallel codes. The principal aspects of Falcon described in this paper are its abstractions and tools for capture and analysis of application-specific program information, performed on-line, with controlled latencies and scalable to parallel machines of substantial size. In addition, Falcon provides support for the on-line graphical display of monitoring information, and it allows programs to be steered during their execution, by human users or algorithmically. This paper presents our basic research motivation, outlines the Falcon system's functionality, and includes a detailed evaluation of its performance characteristics in light of its principal contributions. Falcon's functionality and performance evaluation are driven by our experiences with large-scale parallel applications being developed with end users in physics and in atmospheric sciences. The sample application highlighted in this paper is a molecular dynamics simulation program (MD) used by physicists to study the statistical mechanics of liquids.

Original language: English
Pages (from-to): 699-736
Number of pages: 38
Journal: Concurrency: Practice and Experience
Volume: 10
Issue number: 9
State: Published - Aug 10 1998
Externally published: Yes
