Abstract
Productivity from day one on supercomputers that leverage new technologies requires significant preparation. An institution that procures a novel system architecture often lacks sufficient institutional knowledge and skills to prepare for it. Thus, the "Center of Excellence" (CoE) concept has emerged to prepare for systems such as Summit and Sierra, currently the top two systems in the Top 500. This paper documents CoE experiences that prepared a workload of diverse applications and math libraries for a heterogeneous system. We describe our approach to this preparation, including our management and execution strategies, and detail our experiences with and reasons for using different programming approaches. Our early science and performance results show that the project enabled significant early seismic science with up to a 14X throughput increase over Cori. In addition to our successes, we discuss our challenges and failures so others may benefit from our experience.
| Original language | English |
|---|---|
| Title of host publication | Proceedings of SC 2019 |
| Subtitle of host publication | The International Conference for High Performance Computing, Networking, Storage and Analysis |
| Publisher | IEEE Computer Society |
| ISBN (Electronic) | 9781450362290 |
| State | Published - Nov 17 2019 |
| Event | 2019 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2019 - Denver, United States. Duration: Nov 17 2019 → Nov 22 2019 |
Publication series
| Name | International Conference for High Performance Computing, Networking, Storage and Analysis, SC |
|---|---|
| ISSN (Print) | 2167-4329 |
| ISSN (Electronic) | 2167-4337 |
Conference
| Conference | 2019 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2019 |
|---|---|
| Country/Territory | United States |
| City | Denver |
| Period | 11/17/19 → 11/22/19 |
Funding
Prepared by LLNL under Contract DE-AC52-07NA27344. LLNL-CONF-772139. IBM and NVIDIA participation was supported under CORAL NRE Contract B604142.
Keywords
- GPUs
- Heterogeneous systems
- Large-scale applications
- Performance
- Project management
- Programming models