TY - GEN
T1 - Performance evaluation of the Cray X1 distributed shared memory architecture
AU - Dunigan, Thomas H.
AU - Vetter, Jeffrey S.
AU - Worley, Patrick H.
PY - 2004
Y1 - 2004
N2 - The Cray X1 supercomputer is a distributed shared memory vector multiprocessor, scalable to 4096 processors and up to 65 terabytes of memory. The X1's hierarchical design uses the multi-streaming processor (MSP) as its basic building block; each MSP is capable of 12.8 GF/s for 64-bit operations. The distributed shared memory (DSM) of the X1 presents a 64-bit global address space that is directly addressable from every MSP, with an interconnect bandwidth per computation rate of one byte per floating point operation. Our results show that this high bandwidth and low latency for remote memory accesses translate into improved performance on important applications, such as an Eulerian gyrokinetic-Maxwell solver. Furthermore, this architecture naturally supports programming models like the Cray SHMEM API, Unified Parallel C (UPC), and Co-Array Fortran (CAF), and, as our benchmarks demonstrate, it is imperative to select the appropriate model to exploit these features.
AB - The Cray X1 supercomputer is a distributed shared memory vector multiprocessor, scalable to 4096 processors and up to 65 terabytes of memory. The X1's hierarchical design uses the multi-streaming processor (MSP) as its basic building block; each MSP is capable of 12.8 GF/s for 64-bit operations. The distributed shared memory (DSM) of the X1 presents a 64-bit global address space that is directly addressable from every MSP, with an interconnect bandwidth per computation rate of one byte per floating point operation. Our results show that this high bandwidth and low latency for remote memory accesses translate into improved performance on important applications, such as an Eulerian gyrokinetic-Maxwell solver. Furthermore, this architecture naturally supports programming models like the Cray SHMEM API, Unified Parallel C (UPC), and Co-Array Fortran (CAF), and, as our benchmarks demonstrate, it is imperative to select the appropriate model to exploit these features.
UR - http://www.scopus.com/inward/record.url?scp=14844292730&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:14844292730
SN - 0780386868
SN - 9780780386860
T3 - Proceedings - 12th Annual IEEE Symposium on High Performance Interconnects, Hot Interconnects
SP - 20
EP - 25
BT - Proceedings - 12th Annual IEEE Symposium on High Performance Interconnects, Hot Interconnects
A2 - Watters, S.
T2 - Proceedings - 12th Annual IEEE Symposium on High Performance Interconnects, Hot Interconnects
Y2 - 25 August 2004 through 27 August 2004
ER -