Performance and scalability analysis of Cray X1 vectorization and multistreaming optimization

Sadaf Alam, Jeffrey Vetter

Research output: Contribution to journal, Conference article, peer-review

Abstract

The Cray X1 Fortran and C/C++ compilers provide a number of loop transformations, notably vectorization and multistreaming, to exploit the hardware resources and high memory bandwidth of the multistreaming processor (MSP). A Cray X1 node is composed of four MSPs, each of which is composed of four single-streaming processors (SSPs). Each SSP contains a superscalar processing unit and two vector processing units. Compiler vectorization provides loop-level parallelization and uses the vector processing hardware. Multistreaming code generation by the compiler permits a block of code to execute across the SSPs of an MSP. In this paper, we analyze the overall impact of loop-level compiler optimization on a scientific application called the Parallel Ocean Program (POP). POP has been extensively optimized for the X1 by instrumenting the code with X1 compiler directives. We compare and contrast the automatic and manual optimization schemes available on the X1 and analyze their impact on code performance and scalability. Our results show that the addition of compiler directives increases the average vector length, thereby improving single-node performance significantly. However, this code scales at a slower rate as the local workload volume decreases and the communication costs increase.
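
As an illustration of the two loop-level transformations named in the abstract, the following is a minimal Python sketch, not code from the paper and not Cray compiler output. It models multistreaming as splitting the outer-loop iterations of a loop nest into one contiguous block per SSP, and vectorization as strip-mining the inner loop into vector-length chunks. The four SSPs per MSP come from the abstract; the 64-element maximum vector length, the function names, and the simple definition of average vector length (elements processed divided by number of strips) are assumptions made for the example.

```python
# Hypothetical model of multistreaming and vectorization on one MSP.
# MAX_VL is an assumed hardware vector length, not a figure from the abstract.
MAX_VL = 64
SSPS_PER_MSP = 4  # stated in the abstract: four SSPs per MSP


def split_across_ssps(n_outer, n_ssps=SSPS_PER_MSP):
    """Divide outer-loop iterations into contiguous blocks, one per SSP."""
    base, extra = divmod(n_outer, n_ssps)
    blocks = []
    start = 0
    for ssp in range(n_ssps):
        # The first `extra` SSPs take one additional iteration.
        size = base + (1 if ssp < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


def strip_mine(n_inner, max_vl=MAX_VL):
    """Return the lengths of the vector strips that cover an inner loop."""
    full, remainder = divmod(n_inner, max_vl)
    return [max_vl] * full + ([remainder] if remainder else [])


def average_vector_length(n_inner, max_vl=MAX_VL):
    """Elements processed per vector strip, averaged over all strips."""
    strips = strip_mine(n_inner, max_vl)
    return sum(strips) / len(strips) if strips else 0.0


if __name__ == "__main__":
    print([len(block) for block in split_across_ssps(10)])
    print(strip_mine(150), average_vector_length(150))
```

Under these assumptions, an inner loop whose trip count is not a multiple of the maximum vector length ends with a short strip, which lowers the average. This is only a toy model of the metric the abstract reports; the paper's own measurements and directive choices are not reproduced here.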

Original language: English
Pages (from-to): 304-312
Number of pages: 9
Journal: Lecture Notes in Computer Science
Volume: 3514
Issue number: I
State: Published - 2005
Event: 5th International Conference on Computational Science - ICCS 2005 - Atlanta, GA, United States
Duration: May 22 2005 to May 25 2005

Funding

This research was sponsored by the Office of Mathematical, Information, and Computational Sciences, Office of Science, U.S. Department of Energy under Contract No. DE-AC05-00OR22725 with UT-Battelle, LLC. Accordingly, the U.S. Government retains a nonexclusive, royalty-free license to publish or reproduce the published form of this contribution, or allow others to do so, for U.S. Government purposes.

Funders (funder number)
U.S. Department of Energy (DE-AC05-00OR22725)
Office of Science
