Abstract
The Density Matrix Renormalization Group (DMRG++) is a condensed matter physics application used to study the superconductivity properties of materials. Its main computations consist of calculating the Hamiltonian matrix, which requires sparse matrix-vector multiplications. This paper presents task-based parallelization and optimization strategies for the Hamiltonian algorithm. The algorithm is implemented as a mini-application in C++ and parallelized with OpenMP. The optimization leverages tasking features, such as dependencies or priorities, included in the OpenMP 4.5 standard. The code refactoring targets performance as much as programmability. The optimized version achieves a speedup of 8.0× with 8 threads and 20.5× with 40 threads on a Power9 computing node while reducing the memory consumption to 90 MB with respect to the original code, by adding fewer than ten OpenMP directives.
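To make the tasking features named in the abstract concrete, the following is a minimal sketch of a sparse matrix-vector product split into OpenMP tasks that use the `depend` and `priority` clauses from OpenMP 4.5. It is not the paper's implementation: the `CsrMatrix` type, the `spmv_tasks` function, the row-block decomposition, and the choice of nonzero count as the priority value are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CSR (compressed sparse row) matrix; not the DMRG++ data layout.
struct CsrMatrix {
    std::size_t rows = 0, cols = 0;
    std::vector<std::size_t> row_ptr;  // size rows + 1
    std::vector<std::size_t> col_idx;  // column index of each nonzero
    std::vector<double> values;        // value of each nonzero
};

// Computes y = A * x with one task per block of rows, then sums the entries
// of y in tasks that depend on the corresponding block. Returns that sum.
double spmv_tasks(const CsrMatrix& a, const std::vector<double>& x,
                  std::vector<double>& y, std::size_t block_rows) {
    assert(x.size() == a.cols && block_rows > 0);
    y.assign(a.rows, 0.0);
    double* yp = y.data();
    double total = 0.0;

    #pragma omp parallel
    #pragma omp single
    {
        // Producer tasks: each writes one block of y.
        for (std::size_t begin = 0; begin < a.rows; begin += block_rows) {
            std::size_t end = begin + block_rows < a.rows ? begin + block_rows : a.rows;
            // Assumption: blocks with more nonzeros get a higher priority.
            // The runtime treats this value as a scheduling hint only.
            int prio = static_cast<int>(a.row_ptr[end] - a.row_ptr[begin]);
            #pragma omp task depend(out: yp[begin]) priority(prio)
            {
                for (std::size_t r = begin; r < end; ++r) {
                    double acc = 0.0;
                    for (std::size_t k = a.row_ptr[r]; k < a.row_ptr[r + 1]; ++k)
                        acc += a.values[k] * x[a.col_idx[k]];
                    yp[r] = acc;
                }
            }
        }
        // Consumer tasks: each waits for its block of y, and the inout
        // dependence on total runs the accumulations one after another.
        for (std::size_t begin = 0; begin < a.rows; begin += block_rows) {
            std::size_t end = begin + block_rows < a.rows ? begin + block_rows : a.rows;
            #pragma omp task depend(in: yp[begin]) depend(inout: total)
            {
                for (std::size_t r = begin; r < end; ++r)
                    total += yp[r];
            }
        }
    }  // implicit barrier: all tasks have completed here
    return total;
}
```

Calling `spmv_tasks(a, x, y, 2)` on a small matrix fills `y` and returns the sum of its entries. Built with an OpenMP flag such as `-fopenmp`, the tasks may run on several threads; without it the pragmas are ignored and the same code runs sequentially, which is a convenient way to check results.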
Original language | English |
---|---|
Title of host publication | OpenMP |
Subtitle of host publication | Conquering the Full Hardware Spectrum - 15th International Workshop on OpenMP, IWOMP 2019, Proceedings |
Editors | Xing Fan, Oliver Sinnen, Nasser Giacaman, Bronis R. de Supinski |
Publisher | Springer Verlag |
Pages | 291-305 |
Number of pages | 15 |
ISBN (Print) | 9783030285951 |
State | Published - 2019 |
Event | 15th International Workshop on OpenMP, IWOMP 2019 - Auckland, New Zealand (Sep 11 2019 → Sep 13 2019) |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 11718 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 15th International Workshop on OpenMP, IWOMP 2019 |
---|---|
Country/Territory | New Zealand |
City | Auckland |
Period | 09/11/19 → 09/13/19 |
Funding
This work is partially supported by the Spanish Government through Programa Severo Ochoa (SEV-2015-0493), by the Spanish Ministry of Science and Technology (project TIN2015-65316-P), by the Generalitat de Catalunya (contract 2017-SGR-1414) and by the BSC-IBM Deep Learning Research Agreement, under JSA “Application porting, analysis and optimization for POWER and POWER AI”. This work was also partially supported by the Scientific Discovery through Advanced Computing (SciDAC) program funded by the U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research and Basic Energy Sciences, Division of Materials Sciences and Engineering. This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.
Keywords
- Analysis
- Dependencies
- OpenMP
- Optimization
- Tasks