Abstract
Diffusion Monte Carlo is a highly accurate Quantum Monte Carlo method for electronic structure calculations of materials, but it requires frequent load balancing or population redistribution steps to maintain efficiency on parallel machines. This step can be a significant factor affecting performance, and will become more important as the number of processing elements increases. We propose a new dynamic load balancing algorithm, the Alias Method, and evaluate it theoretically and empirically. An important feature of the new algorithm is that the load can be perfectly balanced with each process receiving at most one message. It is also optimal in the maximum size of messages received by any process. We also optimize its implementation to reduce network contention, a process facilitated by the low messaging requirement of the algorithm: a simple renumbering of the MPI ranks based on proximity and a space filling curve significantly improves the MPI Allgather performance. Empirical results on the petaflop Cray XT Jaguar supercomputer at ORNL show up to 30% improvement in performance on 120,000 cores. The load balancing algorithm may be straightforwardly implemented in existing codes. The algorithm may also be employed by any method with many near identical computational tasks that require load balancing.
Original language | English |
---|---|
Pages (from-to) | 284-292 |
Number of pages | 9 |
Journal | Computer Physics Communications |
Volume | 184 |
Issue number | 2 |
DOIs | |
State | Published - Feb 2013 |
Funding
We acknowledge the ORAU/ORNL HPC program for partial funding, and the INCITE and XSEDE programs for computing time. Research by PRCK was conducted at the Center for Nanophase Materials Sciences, which is sponsored at Oak Ridge National Laboratory by the Scientific User Facilities Division, U.S. Department of Energy.
Funders | Funder number |
---|---|
ORAU/ORNL |
Keywords
- Load balancing
- Parallel computing
- Quantum Monte Carlo