Abstract
Cancer is a leading cause of death in the US, and it results from a combination of two to nine genetic mutations. Identifying the five-hit combinations responsible for several cancer types is computationally intractable even with the fastest supercomputers in the USA. Iterating through the nested loops required by the process presents a simplex-shaped workload with irregular memory access patterns. Distributing this workload efficiently across thousands of GPUs poses the challenge of dividing a simplex-shaped (triangular or tetrahedral) workload into similarly shaped pieces of equal volume. In addition, the irregular memory access patterns create imbalanced compute utilization across nodes. We developed a generalized solution for distributing a simplex-shaped workload by partially coalescing the nested for-loops and by minimizing memory access overhead through efficient use of limited shared memory, a dynamic scheduler, and loop tiling. For 4-hit combinations, we achieved 90-100% strong scaling efficiency on up to 3594 V100 GPUs on the Summit supercomputer. Finally, we designed and implemented a distributed algorithm to identify 5-hit combinations for four different cancer types; the identified combinations differentiate between cancer and normal samples with 86.59-88.79% precision and 84.42-90.91% recall. We also demonstrated the robustness of our solution by porting the code to another leadership-class computing platform, Crusher, a testbed for the fastest supercomputer, Frontier. On Crusher, we achieved 98% strong scaling efficiency on 50 nodes (400 AMD MI250X GCDs) and demonstrated the computational readiness of Frontier for scientific applications.
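To illustrate what coalescing nested loops over a triangular workload can look like, here is a minimal Python sketch. It is not the authors' implementation: the function names (`unrank_pair`, `split_range`, `pairs_for_worker`), the colexicographic ordering, and the static equal-sized split are all assumptions made for illustration. The two outermost loops over gene pairs `i < j` are flattened into a single linear index, so the index range can be cut into contiguous chunks of near-equal size.

```python
import math


def unrank_pair(idx):
    """Map a linear index idx >= 0 to a pair (i, j) with 0 <= i < j.

    Pairs are numbered in colexicographic order:
    (0,1), (0,2), (1,2), (0,3), (1,3), (2,3), ...
    """
    # j is the largest integer with j*(j-1)/2 <= idx.
    j = (1 + math.isqrt(1 + 8 * idx)) // 2
    i = idx - j * (j - 1) // 2
    return i, j


def split_range(total, workers, rank):
    """Return the half-open slice [start, stop) of range(total) for one worker.

    Chunk sizes differ by at most one item.
    """
    base, extra = divmod(total, workers)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def pairs_for_worker(n, workers, rank):
    """List the (i, j) pairs, 0 <= i < j < n, assigned to one worker (hypothetical helper)."""
    total = n * (n - 1) // 2  # number of pairs in the triangular loop
    start, stop = split_range(total, workers, rank)
    return [unrank_pair(idx) for idx in range(start, stop)]


if __name__ == "__main__":
    for rank in range(3):
        print(rank, pairs_for_worker(5, 3, rank))
```

This sketch balances only the number of outer-loop iterations per worker. The work inside each pair (the remaining inner loops of a 4-hit or 5-hit search) still varies with the pair, so a static split like this one would not by itself equalize the load.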
Original language | English |
---|---|
Title of host publication | Proceedings - 2023 IEEE International Parallel and Distributed Processing Symposium, IPDPS 2023 |
Publisher | Institute of Electrical and Electronics Engineers Inc. |
Pages | 974-984 |
Number of pages | 11 |
ISBN (Electronic) | 9798350337662 |
State | Published - 2023 |
Event | 37th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2023 - St. Petersburg, United States; Duration: May 15 2023 → May 19 2023 |
Publication series
Name | Proceedings - 2023 IEEE International Parallel and Distributed Processing Symposium, IPDPS 2023 |
---|
Conference
Conference | 37th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2023 |
---|---|
Country/Territory | United States |
City | St. Petersburg |
Period | 05/15/23 → 05/19/23 |
Funding
This work was supported by the resources of the Oak Ridge Leadership Computing Facility, located in the National Center for Computational Sciences at ORNL, which is managed by UT-Battelle, LLC for the U.S. DOE under contract No. DE-AC05-00OR22725.
Keywords
- Cancer genomics
- nested loops
- scheduler
- simplex