Abstract
Accurate large-scale first-principles calculations based on density functional theory (DFT) in metallic systems are prohibitively expensive because their computational complexity scales asymptotically as the cube of the number of electrons. Using algorithmic advances in finite-element discretization for DFT (DFT-FE), in conjunction with efficient computational methodologies and mixed-precision strategies, we delay the onset of this cubic scaling by significantly reducing the computational prefactor while increasing the arithmetic intensity and lowering the data movement costs. This has enabled fast, accurate and massively parallel DFT calculations on large-scale metallic systems on both many-core and heterogeneous architectures, with time-to-solution an order of magnitude faster than state-of-the-art plane-wave DFT codes. We demonstrate an unprecedented sustained performance of 46 PFLOPS (27.8% of peak FP64 performance) on a dislocation system in magnesium containing 105,080 electrons using 3,800 GPU nodes of the Summit supercomputer, which is the highest performance to date among DFT codes.
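The arithmetic behind the figures quoted in the abstract can be checked with a minimal sketch. It uses only the three numbers stated above (46 PFLOPS, 27.8%, 3,800 nodes). The variable names and the helper `cubic_cost_ratio` are illustrative assumptions, and the derived peak is only what the rounded percentages imply, not an official machine specification.

```python
# Figures quoted in the abstract.
sustained_pflops = 46.0    # sustained FP64 performance, in PFLOPS
fraction_of_peak = 0.278   # 27.8% of peak FP64 performance
nodes = 3800               # Summit GPU nodes used in the run

# Peak FP64 throughput implied by the two quoted figures (about 165.5 PFLOPS).
implied_peak_pflops = sustained_pflops / fraction_of_peak

# Per-node averages, converted from PFLOPS to TFLOPS.
sustained_per_node_tflops = sustained_pflops * 1000 / nodes        # about 12.1
implied_peak_per_node_tflops = implied_peak_pflops * 1000 / nodes  # about 43.5


def cubic_cost_ratio(n_new: float, n_old: float) -> float:
    """Relative cost under ideal cubic scaling with electron count (hypothetical helper)."""
    return (n_new / n_old) ** 3


if __name__ == "__main__":
    print(f"implied peak: {implied_peak_pflops:.1f} PFLOPS")
    print(f"sustained per node: {sustained_per_node_tflops:.1f} TFLOPS")
    print(f"cost ratio for 2x electrons: {cubic_cost_ratio(2, 1):.0f}x")
```

Under ideal cubic scaling, doubling the number of electrons multiplies the cost by eight, which is why reducing the prefactor and delaying the onset of that regime matters for systems of this size.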
Original language | English |
---|---|
Title of host publication | Proceedings of SC 2019 |
Subtitle of host publication | The International Conference for High Performance Computing, Networking, Storage and Analysis |
Publisher | IEEE Computer Society |
ISBN (Electronic) | 9781450362290 |
State | Published - Nov 17 2019 |
Event | 2019 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2019 - Denver, United States; Duration: Nov 17 2019 → Nov 22 2019 |
Publication series
Name | International Conference for High Performance Computing, Networking, Storage and Analysis, SC |
---|---|
ISSN (Print) | 2167-4329 |
ISSN (Electronic) | 2167-4337 |
Conference
Conference | 2019 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2019 |
---|---|
Country/Territory | United States |
City | Denver |
Period | 11/17/19 → 11/22/19 |
Funding
We gratefully acknowledge support from DOE-BES (DE-SC0008637) and Toyota Research Institute. This work used resources of OLCF (DE-AC05-00OR22725), ALCF (DE-AC02-06CH11357), and NERSC (DE-AC02-05CH11231). V.G. also gratefully acknowledges support from AFOSR and ARO for algorithmic developments, and B.T. acknowledges support from the LDRD program of ORNL.
Keywords
- Density functional theory
- Finite-elements
- Heterogeneous architectures
- Light-weight alloys
- Mixed precision
- Scalability