TY - GEN
T1 - Distributing Simplex-Shaped Nested for-Loops to Identify Carcinogenic Gene Combinations
AU - Dash, Sajal
AU - Haque Monil, Mohammad Alaul
AU - Yin, Junqi
AU - Anandakrishnan, Ramu
AU - Wang, Feiyi
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Cancer is a leading cause of death in the US, and it results from a combination of two to nine genetic mutations. Identifying the five-hit combinations responsible for several cancer types is computationally intractable even with the fastest supercomputers in the USA. Iterating through the nested loops required by the process presents a simplex-shaped workload with irregular memory access patterns. Distributing this workload efficiently across thousands of GPUs poses the challenge of dividing a simplex-shaped (triangular/tetrahedral) workload into similar shapes of equal volume. Irregular memory access patterns also create imbalanced compute utilization across nodes. We developed a generalized solution for distributing a simplex-shaped workload by partially coalescing the nested for-loops and by minimizing memory access overhead through efficient use of limited shared memory, a dynamic scheduler, and loop tiling. For 4-hit combinations, we achieved 90-100% strong scaling efficiency on up to 3594 V100 GPUs on the Summit supercomputer. Finally, we designed and implemented a distributed algorithm to identify 5-hit combinations for four different cancer types; the identified combinations can differentiate between cancer and normal samples with 86.59-88.79% precision and 84.42-90.91% recall. We also demonstrated the robustness of our solution by porting the code to another leadership-class computing platform, Crusher, a testbed for the fastest supercomputer, Frontier. On Crusher, we achieved 98% strong scaling efficiency on 50 nodes (400 AMD MI250X GCDs) and demonstrated the computational readiness of Frontier for scientific applications.
AB - Cancer is a leading cause of death in the US, and it results from a combination of two to nine genetic mutations. Identifying the five-hit combinations responsible for several cancer types is computationally intractable even with the fastest supercomputers in the USA. Iterating through the nested loops required by the process presents a simplex-shaped workload with irregular memory access patterns. Distributing this workload efficiently across thousands of GPUs poses the challenge of dividing a simplex-shaped (triangular/tetrahedral) workload into similar shapes of equal volume. Irregular memory access patterns also create imbalanced compute utilization across nodes. We developed a generalized solution for distributing a simplex-shaped workload by partially coalescing the nested for-loops and by minimizing memory access overhead through efficient use of limited shared memory, a dynamic scheduler, and loop tiling. For 4-hit combinations, we achieved 90-100% strong scaling efficiency on up to 3594 V100 GPUs on the Summit supercomputer. Finally, we designed and implemented a distributed algorithm to identify 5-hit combinations for four different cancer types; the identified combinations can differentiate between cancer and normal samples with 86.59-88.79% precision and 84.42-90.91% recall. We also demonstrated the robustness of our solution by porting the code to another leadership-class computing platform, Crusher, a testbed for the fastest supercomputer, Frontier. On Crusher, we achieved 98% strong scaling efficiency on 50 nodes (400 AMD MI250X GCDs) and demonstrated the computational readiness of Frontier for scientific applications.
KW - Cancer genomics
KW - nested loops
KW - scheduler
KW - simplex
UR - http://www.scopus.com/inward/record.url?scp=85166636620&partnerID=8YFLogxK
U2 - 10.1109/IPDPS54959.2023.00101
DO - 10.1109/IPDPS54959.2023.00101
M3 - Conference contribution
AN - SCOPUS:85166636620
T3 - Proceedings - 2023 IEEE International Parallel and Distributed Processing Symposium, IPDPS 2023
SP - 974
EP - 984
BT - Proceedings - 2023 IEEE International Parallel and Distributed Processing Symposium, IPDPS 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 37th IEEE International Parallel and Distributed Processing Symposium, IPDPS 2023
Y2 - 15 May 2023 through 19 May 2023
ER -