TY - GEN
T1 - A holistic approach for performance measurement and analysis for petascale applications
AU - Jagode, Heike
AU - Dongarra, Jack
AU - Alam, Sadaf
AU - Vetter, Jeffrey
AU - Spear, Wyatt
AU - Malony, Allen D.
PY - 2009
Y1 - 2009
N2 - Contemporary high-end Terascale and Petascale systems are composed of hundreds of thousands of commodity multi-core processors interconnected with high-speed custom networks. Performance characteristics of applications executing on these systems are a function of system hardware and software as well as workload parameters. Therefore, it has become increasingly challenging to measure, analyze and project performance using a single tool on these systems. In order to address these issues, we propose a methodology for performance measurement and analysis that is aware of applications and the underlying system hierarchies. On the application level, we measure cost distribution and runtime-dependent values for different components of the underlying programming model. On the system front, we measure and analyze information gathered for unique system features, particularly shared components in the multi-core processors. We demonstrate our approach using a Petascale combustion application called S3D on two high-end Teraflops systems, Cray XT4 and IBM Blue Gene/P, using a combination of hardware performance monitoring, profiling and tracing tools.
AB - Contemporary high-end Terascale and Petascale systems are composed of hundreds of thousands of commodity multi-core processors interconnected with high-speed custom networks. Performance characteristics of applications executing on these systems are a function of system hardware and software as well as workload parameters. Therefore, it has become increasingly challenging to measure, analyze and project performance using a single tool on these systems. In order to address these issues, we propose a methodology for performance measurement and analysis that is aware of applications and the underlying system hierarchies. On the application level, we measure cost distribution and runtime-dependent values for different components of the underlying programming model. On the system front, we measure and analyze information gathered for unique system features, particularly shared components in the multi-core processors. We demonstrate our approach using a Petascale combustion application called S3D on two high-end Teraflops systems, Cray XT4 and IBM Blue Gene/P, using a combination of hardware performance monitoring, profiling and tracing tools.
KW - Performance Analysis
KW - Performance Tools
KW - Petascale Applications
KW - Petascale Systems
KW - Profiling
KW - Trace files
KW - Tracing
UR - http://www.scopus.com/inward/record.url?scp=70149086071&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-01973-9_77
DO - 10.1007/978-3-642-01973-9_77
M3 - Conference contribution
AN - SCOPUS:70149086071
SN - 3642019722
SN - 9783642019722
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 686
EP - 695
BT - Computational Science - ICCS 2009 - 9th International Conference, Proceedings
T2 - 9th International Conference on Computational Science, ICCS 2009
Y2 - 25 May 2009 through 27 May 2009
ER -