Distributed Multi-GPU Community Detection on Exascale Computing Platforms

Naw Safrin Sattar, Hao Lu, Feiyi Wang, Mahantesh Halappanavar

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Scopus citations

Abstract

Community detection is a fundamental operation in graph mining, and by uncovering hidden structures and patterns within complex systems it helps solve fundamental problems pertaining to social networks, such as information diffusion, epidemics, and recommender systems. Scaling graph algorithms for massive networks becomes challenging on modern distributed-memory multi-GPU (Graphics Processing Unit) systems due to limitations such as irregular memory access patterns, load imbalances, higher communication-computation ratios, and cross-platform support. We present a novel algorithm HiPDPL-GPU (Distributed Parallel Louvain) to address these challenges. We conduct experiments involving different partitioning techniques to achieve an optimized performance of HiPDPL-GPU on the two largest supercomputers: Frontier and Summit. Remarkably, HiPDPL-GPU processes a graph with 4.2 billion edges in less than 3 minutes using 1024 GPUs. Qualitatively, the performance of HiPDPL-GPU is similar or better compared to other state-of-the-art CPU- and GPU-based implementations. While prior GPU implementations have predominantly employed CUDA, our first-of-its-kind implementation for community detection is cross-platform, accommodating both AMD and NVIDIA GPUs.

Original language: English
Title of host publication: 2024 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 815-824
Number of pages: 10
ISBN (Electronic): 9798350364606
State: Published - 2024
Event: 2024 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2024 - San Francisco, United States
Duration: May 27 2024 to May 31 2024

Publication series

Name: 2024 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2024

Conference

Conference: 2024 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2024
Country/Territory: United States
City: San Francisco
Period: 05/27/24 to 05/31/24

Keywords

  • clustering
  • community detection
  • HIP
  • hybrid
  • Louvain
  • MPI
  • multi-GPU
