Abstract
The Rapid Alloy Method for Producing Accurate, General Empirical potential generation toolkit (RAMPAGE) is a program for fitting multicomponent interatomic potential functions for metal alloys. In this paper, we describe a collaborative effort between domain scientists and performance engineers to improve the parallelism, scalability, and maintainability of the code. We modified RAMPAGE to use the Message Passing Interface (MPI) for communication and synchronization, to use more than one MPI process when evaluating candidate potential functions, and to have its MPI processes execute functionality that was previously executed by external non-MPI processes. We ported RAMPAGE to run on the Eos and Titan Cray systems of the United States Department of Energy (DOE)'s Oak Ridge Leadership Computing Facility (OLCF), and the Cori and Edison systems at the DOE's National Energy Research Scientific Computing Center (NERSC). Our modifications resulted in a 7× speedup on 8 Eos system nodes, and scalability up to 2048 processes on the Cori system with Intel Knights Landing processors.
Original language | English |
---|---|
Title of host publication | SEPS 2017 - Proceedings of the 4th ACM SIGPLAN International Workshop on Software Engineering for Parallel Systems, co-located with SPLASH 2017 |
Editors | Ali Jannesari, Tim Mattson, Pablo de Oliveira Castro, Yukinori Sato |
Publisher | Association for Computing Machinery, Inc |
Pages | 11-20 |
Number of pages | 10 |
ISBN (Electronic) | 9781450355179 |
DOIs | |
State | Published - Oct 23 2017 |
Event | 4th ACM SIGPLAN International Workshop on Software Engineering for Parallel Systems, SEPS 2017 - Vancouver, Canada Duration: Oct 23 2017 → … |
Publication series
Name | SEPS 2017 - Proceedings of the 4th ACM SIGPLAN International Workshop on Software Engineering for Parallel Systems, co-located with SPLASH 2017 |
---|
Conference
Conference | 4th ACM SIGPLAN International Workshop on Software Engineering for Parallel Systems, SEPS 2017 |
---|---|
Country/Territory | Canada |
City | Vancouver |
Period | 10/23/17 → … |
Bibliographical note
Publisher Copyright: © 2017 Copyright held by the owner/author(s).
Funding
In this paper, we describe a recent collaboration between members of the Center for Performance and Design of Nuclear Waste Forms and Containers (WastePD) [3], an Energy Frontier Research Center (EFRC) supported by the United States Department of Energy (DOE) Office of Science, and members of the Institute for Sustained Performance, Energy, and Resilience (SUPER) project of the Office of Science’s Scientific Discovery through Advanced Computing (SciDAC) program. The Rapid Alloy Method for Producing Accurate, General Empirical potential generation toolkit (RAMPAGE) [10, 13, 14] is software that finds multicomponent interatomic potential functions for metal alloys. RAMPAGE is used to study the properties of metallic glasses and high-entropy alloys for use in nuclear waste containers. Our experience illustrates a collaborative approach to refactoring and re-engineering an existing application for improved parallelism, scalability, and maintainability. RAMPAGE uses a genetic algorithm approach in a master-worker organization, and its initial implementation could exploit multiple cores within a single system node. Thus, a major focus for our collaboration was to enable RAMPAGE to use more than one system node, and to enable its underlying molecular dynamics (MD) simulations to use more than one process when evaluating candidate potential functions. We modified RAMPAGE to use the Message Passing Interface (MPI) [4, 5] for communication and synchronization, to support multiple MPI processes for each MD simulation, and to have its MPI processes execute functionality that was previously accomplished using externally-invoked, non-MPI processes. We ported the software to the OLCF’s Titan Cray XK7 with graphics processing units (GPUs), OLCF’s Eos Cray XC30 system, NERSC’s Cori Cray XC40 system with Intel Knights Landing manycore processors, and NERSC’s Edison Cray XC30 system. On the Eos system, our modifications resulted in a 7× speedup on 8 compute nodes.
Based on our evaluation of these modifications, we developed several recommendations for future work to further improve RAMPAGE’s performance and scalability. The contributions from OSU were supported as part of the Center for Performance and Design of Nuclear Waste Forms and Containers, an Energy Frontier Research Center funded by the U.S. Department of Energy, Office of Science, Basic Energy Sciences under Award # DE-SC0016584. This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research under contract number DE-AC05-00OR22725. This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. This manuscript has been co-authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan). Researchers from LBNL were funded by the Advanced Scientific Computing Research Program in the U.S. Department of Energy, Office of Science, under Award Number DE-AC02-05CH11231. This research used resources of the National Energy Research Scientific Computing Center (NERSC), which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231.
Funders | Funder number |
---|---|
Center for Performance and Design of Nuclear Waste Forms and Containers | |
Energy Frontier Research Center | |
WastePD | |
U.S. Department of Energy | |
Office of Science | DE-AC02-05CH11231 |
Basic Energy Sciences | DE-SC0016584 |
Advanced Scientific Computing Research | DE-AC05-00OR22725 |