Abstract
Computational bioinformatics and biomedical applications frequently contain heterogeneously sized units of work, or tasks, for instance because biological sequences and molecules vary in size. Variable-sized workloads lead to load imbalances in parallel implementations, which detract from efficiency and performance. Many modern computing resources now have multiple graphics processing units (GPUs) per computer for acceleration. Using these multi-GPU resources efficiently requires balancing workloads across the GPUs. OpenMP is a portable directive-based parallel programming API used ubiquitously in bioscience applications to program CPUs; recently, the use of OpenMP directives for GPU acceleration has become possible. Here, motivated by experiences with imbalanced loads in GPU-accelerated bioinformatics applications, we address the load balancing problem using OpenMP task-to-GPU scheduling combined with OpenMP GPU offloading for multiply heterogeneous workloads (loads with both variable input sizes and, simultaneously, variable convergence rates for algorithms with a stochastic component) scheduled across multiple GPUs. We aim to develop strategies that are easy to use, have lower overheads, and can be incorporated incrementally into existing programs that already use OpenMP for CPU-based threading, so that those programs can take advantage of multi-GPU computers. We test different combinations of input size variability and convergence rate variability, and characterize the effects of these scenarios on the performance of scheduling strategies across multiple GPUs with OpenMP. We present several dynamic scheduling solutions for different parallel patterns, explore optimizations, and provide publicly available example computational kernels to make these strategies easy to use in programs.
This work will enable application developers to efficiently and easily use multiple GPUs for imbalanced workloads found in bioinformatics and biomedical applications.
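As a rough illustration of the general idea of dynamic task-to-GPU scheduling with OpenMP offloading, the sketch below assigns one host thread to each GPU and lets `schedule(dynamic, 1)` hand the next variable-sized task to whichever thread becomes free first. It is a minimal sketch under stated assumptions, not the kernels published with this work: the function names `run_task_on_device` and `schedule_tasks`, the flat `data`/`offset`/`size` task layout, and the summation kernel are all hypothetical stand-ins.

```c
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical stand-in for one unit of work: sums the n inputs of a task.
   A real application kernel would replace the loop body. */
static double run_task_on_device(const double *in, int n, int dev)
{
    double sum = 0.0;
    /* Offload to device `dev`; typical implementations fall back to the
       host when no offload device is available. */
    #pragma omp target teams distribute parallel for reduction(+:sum) \
            device(dev) map(to: in[0:n]) map(tofrom: sum)
    for (int i = 0; i < n; ++i)
        sum += in[i];
    return sum;
}

/* Assumed layout: task t owns data[offset[t] .. offset[t] + size[t] - 1].
   One host thread drives each GPU; the dynamic schedule gives the next
   task to whichever thread (and hence GPU) finishes its current task. */
void schedule_tasks(const double *data, const int *offset, const int *size,
                    int ntasks, double *result)
{
    int ndev = 1;
#ifdef _OPENMP
    ndev = omp_get_num_devices();
    if (ndev < 1)
        ndev = 1; /* no GPU found: use a single host thread */
#endif

    #pragma omp parallel for num_threads(ndev) schedule(dynamic, 1)
    for (int t = 0; t < ntasks; ++t) {
        int dev = 0;
#ifdef _OPENMP
        dev = omp_get_thread_num() % ndev; /* thread k drives GPU k */
#endif
        result[t] = run_task_on_device(data + offset[t], size[t], dev);
    }
}
```

Calling `schedule_tasks(data, offset, size, ntasks, result)` fills `result[t]` with the output of task `t`. With a static schedule, a GPU that happens to receive the largest or slowest-converging tasks would finish last while the others idle; the dynamic schedule trades a small per-task dispatch cost for better balance. Because the directives are guarded, the same source also builds and runs serially without OpenMP.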
| Original language | English |
|---|---|
| Title of host publication | Proceedings - 2021 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2021 |
| Editors | Yufei Huang, Lukasz Kurgan, Feng Luo, Xiaohua Tony Hu, Yidong Chen, Edward Dougherty, Andrzej Kloczkowski, Yaohang Li |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 1992-1999 |
| Number of pages | 8 |
| ISBN (Electronic) | 9781665401265 |
| DOIs | |
| State | Published - 2021 |
| Event | 2021 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2021 - Virtual, Online, United States. Duration: Dec 9 2021 → Dec 12 2021 |
Publication series
| Name | Proceedings - 2021 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2021 |
|---|---|
Conference
| Conference | 2021 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2021 |
|---|---|
| Country/Territory | United States |
| City | Virtual, Online |
| Period | 12/9/21 → 12/12/21 |
Funding
This research was supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration, in particular its subproject on Scaling OpenMP with LLVM for Exascale performance and portability (SOLLVE), and by the Laboratory Directed Research and Development Program at Oak Ridge National Laboratory (ORNL). It used resources of the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility supported under Contract DE-AC05-00OR22725. We thank Deepak Empachati from Cray/HPE for providing guidance on efficient implementation of our strategies. We thank Abid Malik and Swaroop Pophale from SOLLVE for providing input on the benefits of a multi-GPU scheduling strategy. We thank Jianlin Cheng and Raj S. Roy at the University of Missouri, Columbia, for providing the data and the inspiration for the multi-GPU CCMPred problem.
Keywords
- OpenMP
- computational biology
- high performance computing
- load balancing
- multiple GPUs