Exaflops Biomedical Knowledge Graph Analytics

Ramakrishnan Kannan, Piyush Sao, Hao Lu, Jakub Kurzak, Gundolf Schenk, Yongmei Shi, Seung Hwan Lim, Sharat Israni, Vijay Thakkar, Guojing Cong, Robert Patton, Sergio E. Baranzini, Richard Vuduc, Thomas Potok

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

4 Scopus citations

Abstract

We are motivated by newly proposed methods for mining large-scale corpora of scholarly publications (e.g., the full biomedical literature), which consist of tens of millions of papers spanning decades of research. In this setting, analysts seek to discover relationships among concepts. They construct graph representations from annotated text databases, formulate the relationship-mining problem as an all-pairs shortest paths (APSP) problem, and validate connective paths against curated biomedical knowledge graphs (e.g., SPOKE). In this context, we present Coast (Exascale Communication-Optimized All-Pairs Shortest Path) and demonstrate 1.004 EF/s on 9,200 Frontier nodes (73,600 GCDs). We develop hyperbolic performance models (HYPERMOD), which guide optimizations and parametric tuning. The proposed Coast algorithm achieved a memory-constant parallel efficiency of 99% in the single-precision tropical semiring. Looking forward, Coast will enable the integration of scholarly corpora like PubMed into the SPOKE biomedical knowledge graph.
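For readers unfamiliar with the terminology, APSP over the tropical semiring replaces the usual multiply and add with addition and minimum. The block below is a minimal sketch of the textbook Floyd-Warshall algorithm in pure Python, not the Coast implementation described in the paper. The function name `floyd_warshall` and the four-vertex example graph are hypothetical and exist only for illustration. The sketch also uses Python's double-precision floats rather than the single precision mentioned in the abstract.

```python
import math

INF = math.inf  # "no edge": the additive identity of the (min, +) semiring


def floyd_warshall(weights):
    """Return all-pairs shortest-path distances for a dense weight matrix.

    weights[i][j] is the edge weight from vertex i to vertex j,
    or INF when there is no edge. Negative cycles are not handled.
    """
    n = len(weights)
    dist = [row[:] for row in weights]  # copy so the input is left untouched
    for i in range(n):
        dist[i][i] = min(dist[i][i], 0.0)  # a vertex reaches itself at cost 0
    for k in range(n):
        row_k = dist[k]
        for i in range(n):
            d_ik = dist[i][k]
            if d_ik == INF:
                continue  # no path i -> k, so routing through k cannot help
            row_i = dist[i]
            for j in range(n):
                # Tropical "multiply" is +, tropical "add" is min.
                candidate = d_ik + row_k[j]
                if candidate < row_i[j]:
                    row_i[j] = candidate
    return dist


# Hypothetical 4-concept graph; the weights are made up for illustration.
example = [
    [0.0, 3.0, INF, 7.0],
    [8.0, 0.0, 2.0, INF],
    [5.0, INF, 0.0, 1.0],
    [2.0, INF, INF, 0.0],
]
distances = floyd_warshall(example)
```

Each inner-loop step performs one addition and one minimum. The loop structure is the same as that of a dense matrix product, which is why this formulation lends itself to GPU-style hardware.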

Original language: English
Title of host publication: Proceedings of SC 2022
Subtitle of host publication: International Conference for High Performance Computing, Networking, Storage and Analysis
Publisher: IEEE Computer Society
ISBN (Electronic): 9781665454445
DOIs
State: Published - 2022
Event: 2022 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2022 - Dallas, United States
Duration: Nov 13 2022 to Nov 18 2022

Publication series

Name: International Conference for High Performance Computing, Networking, Storage and Analysis, SC
Volume: 2022-November
ISSN (Print): 2167-4329
ISSN (Electronic): 2167-4337

Conference

Conference: 2022 International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2022
Country/Territory: United States
City: Dallas
Period: 11/13/22 to 11/18/22

Funding

This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the US Department of Energy. This research used resources of the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).

This material is based upon work supported by the US Department of Energy (DOE), Office of Science, Office of Advanced Scientific Computing Research (Robinson Pino, program manager) under contract DE-AC05-00OR22725 and by the National Science Foundation (NSF) under award number 1710371. SPOKE (https://spoke.ucsf.edu) development was funded in substantial part by the NSF Convergence Accelerator awards 1937160 and 12033569. This research used resources of the OLCF, which is a DOE Office of Science User Facility supported under contract DE-AC05-00OR22725.

Funders (funder number)
National Science Foundation: 1937160, 12033569, 1710371
U.S. Department of Energy
Office of Science
Advanced Scientific Computing Research: DE-AC05-00OR22725

Keywords

• High-Performance Computing
• Parallel Algorithms
• Shortest Path Problem
