Abstract
Global Address Space (GAS) programming models are attractive because they retain the easy-to-use addressing model characteristic of shared-memory style load and store operations. The scalability of GAS models depends directly on the design and implementation of runtime libraries on the targeted platforms. In this paper, we examine the memory requirements of a popular GAS runtime library, the Aggregate Remote Memory Copy Interface (ARMCI), on petascale Cray XT5 systems. We then describe a new technique, cooperative server clustering, that enhances the memory scalability of ARMCI communication servers. In cooperative server clustering, ARMCI servers are organized into clusters and cooperatively process incoming communication requests among themselves. A request intervention scheme is also designed to expedite the return of responses to the initiating processes. Our experimental results demonstrate that, with very little impact on ARMCI communication latency and bandwidth, cooperative server clustering significantly reduces the memory requirements of ARMCI communication servers, thereby enabling highly scalable scientific applications. In particular, it reduces the total execution time of a scientific application, NWChem, by 45% on 2400 processes.
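For context, the sketch below illustrates the one-sided ARMCI communication model that the paper's runtime work targets. It is a minimal, illustrative use of the public ARMCI API (`ARMCI_Malloc`, `ARMCI_Put`, `ARMCI_Get`), not the paper's cooperative server clustering implementation, which lives inside the runtime and is transparent at this level; the ring exchange pattern and buffer size are arbitrary choices for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include <armci.h>

int main(int argc, char **argv) {
    int me, nprocs;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &me);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
    ARMCI_Init();                     /* ARMCI runs on top of MPI */

    /* Collective allocation: ptrs[i] is the address of the remotely
       accessible segment owned by process i. */
    void **ptrs = malloc(nprocs * sizeof(void *));
    ARMCI_Malloc(ptrs, sizeof(double));

    /* One-sided put: deposit our rank into the right neighbor's segment.
       The target posts no matching receive; on platforms like the
       Cray XT5, such requests are serviced by the ARMCI communication
       servers discussed in the abstract. */
    double val = (double) me;
    int right = (me + 1) % nprocs;
    ARMCI_Put(&val, ptrs[right], sizeof(double), right);

    ARMCI_AllFence();                 /* complete all outstanding remote ops */
    MPI_Barrier(MPI_COMM_WORLD);

    /* One-sided get: fetch back the value just deposited remotely. */
    double fetched;
    ARMCI_Get(ptrs[right], &fetched, sizeof(double), right);
    printf("rank %d fetched %.0f from rank %d\n", me, fetched, right);

    ARMCI_Free(ptrs[me]);
    ARMCI_Finalize();
    MPI_Finalize();
    return 0;
}
```

In the paper's design, the servers that service such put/get requests are grouped into clusters that share the work of handling incoming requests, which bounds the per-server memory footprint as the process count grows.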
Original language | English
---|---
Pages (from-to) | 57-64
Number of pages | 8
Journal | Computer Science - Research and Development
Volume | 25
Issue number | 1-2
State | Published - 2010
Funding
This work was funded in part by a UT-Battelle grant (UT-B-4000087151) to Auburn University, and in part by the National Center for Computational Sciences. This research used resources of the National Center for Computational Sciences at Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. This research was also supported by an allocation of advanced computing resources provided by the National Science Foundation. Part of the computations were performed on Kraken (a Cray XT5) at the National Institute for Computational Sciences (http://www.nics.tennessee.edu/).
Funders | Funder number
---|---
National Center for Computational Sciences |
National Science Foundation | 1059376
U.S. Department of Energy | DE-AC05-00OR22725
Office of Science |
Auburn University |
UT-Battelle | UT-B-4000087151
Keywords
- ARMCI
- Cray XT5
- PGAS