TY - GEN
T1 - Exaflops Biomedical Knowledge Graph Analytics
AU - Kannan, Ramakrishnan
AU - Sao, Piyush
AU - Lu, Hao
AU - Kurzak, Jakub
AU - Schenk, Gundolf
AU - Shi, Yongmei
AU - Lim, Seung Hwan
AU - Israni, Sharat
AU - Thakkar, Vijay
AU - Cong, Guojing
AU - Patton, Robert
AU - Baranzini, Sergio E.
AU - Vuduc, Richard
AU - Potok, Thomas
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - We are motivated by newly proposed methods for mining large-scale corpora of scholarly publications (e.g., the full biomedical literature), which consist of tens of millions of papers spanning decades of research. In this setting, analysts seek to discover relationships among concepts. They construct graph representations from annotated text databases, formulate the relationship-mining problem as an all-pairs shortest paths (APSP) problem, and validate connective paths against curated biomedical knowledge graphs (e.g., Spoke). In this context, we present Coast (Exascale Communication-Optimized All-Pairs Shortest Path) and demonstrate 1.004 EF/s on 9,200 Frontier nodes (73,600 GCDs). We develop hyperbolic performance models (HYPERMOD), which guide optimizations and parametric tuning. The proposed Coast algorithm achieved a memory-constant parallel efficiency of 99% in the single-precision tropical semiring. Looking forward, Coast will enable the integration of scholarly corpora like PubMed into the Spoke biomedical knowledge graph.
AB - We are motivated by newly proposed methods for mining large-scale corpora of scholarly publications (e.g., the full biomedical literature), which consist of tens of millions of papers spanning decades of research. In this setting, analysts seek to discover relationships among concepts. They construct graph representations from annotated text databases, formulate the relationship-mining problem as an all-pairs shortest paths (APSP) problem, and validate connective paths against curated biomedical knowledge graphs (e.g., Spoke). In this context, we present Coast (Exascale Communication-Optimized All-Pairs Shortest Path) and demonstrate 1.004 EF/s on 9,200 Frontier nodes (73,600 GCDs). We develop hyperbolic performance models (HYPERMOD), which guide optimizations and parametric tuning. The proposed Coast algorithm achieved a memory-constant parallel efficiency of 99% in the single-precision tropical semiring. Looking forward, Coast will enable the integration of scholarly corpora like PubMed into the Spoke biomedical knowledge graph.
KW - High-Performance Computing
KW - Parallel Algorithms
KW - Shortest Path Problem
UR - http://www.scopus.com/inward/record.url?scp=85149282520&partnerID=8YFLogxK
U2 - 10.1109/SC41404.2022.00011
DO - 10.1109/SC41404.2022.00011
M3 - Conference contribution
AN - SCOPUS:85149282520
T3 - International Conference for High Performance Computing, Networking, Storage and Analysis, SC
BT - Proceedings of SC 2022
PB - IEEE Computer Society
T2 - 2022 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2022
Y2 - 13 November 2022 through 18 November 2022
ER -