DyG-DPCD: A Distributed Parallel Community Detection Algorithm for Large-Scale Dynamic Graphs

Naw Safrin Sattar, Khaled Z. Ibrahim, Aydin Buluc, Shaikh Arifuzzaman

Research output: Contribution to journalArticlepeer-review

Abstract

Dynamic (Temporal) graphs capture the valuable evolution of real-world systems, from the continuously evolving patterns of social interactions and genetic pathways to the dynamic fluctuations of economic forces. Detecting communities for such evolving networks poses unique challenges. Detecting and analyzing the evolution of communities within dynamic graphs unlocks valuable insights into the underlying structural and temporal patterns of real-world systems. However, the sheer volume of modern graph data and the inherent complexity of the temporal dimension pose significant challenges to scalable community detection algorithms. Addressing this gap, our work explores the limited landscape of scalable distributed-memory parallel methods specifically designed for dynamic network community detection. We propose a novel parallel algorithm, DyG-DPCD (Dynamic Graph Distributed Parallel Community Detection), to detect communities in dynamic networks using the Message Passing Interface (MPI) framework. We present a vertex-centric approach, allowing us to detect communities through local optimization. Furthermore, we enhance our baseline algorithm by incorporating three heuristics, which improve the algorithm’s performance significantly while maintaining the quality of the solutions. We demonstrate the efficiency of our algorithm by experimenting on several real-world large-scale networks with hundreds of millions of edges spanning diverse domains. Notably, DyG-DPCD achieves speedups between 25× and 30× for large networks that we experimented on using NERSC compute nodes. Our algorithm outperforms the STINGER parallel re-agglomeration algorithm by 30×.

Original languageEnglish
Article number4
JournalInternational Journal of Parallel Programming
Volume53
Issue number1
DOIs
StatePublished - Feb 2025

Funding

This material is based upon work supported by the National Science Foundation under Grant Number 2323533. The work is also partially supported by Lawrence Berkeley National Laboratory under Contract No. DE-AC02-05CH11231 with the U.S. Department of Energy and the Exascale Computing Project (17-SC-20-SC), a joint project of the U.S. Dept. Of Energy and National Nuclear Security Administration. This manuscript has been authored by employees of UT-Battelle, LLC, under Contract No. DE-AC05-00OR22725, and Lawrence Berkeley National Laboratory under Contract No. DE-AC02-05CH11231 with the U.S. Department of Energy (DOE). The U.S. Government retains, and the publisher, by accepting the article for publication, acknowledges, that the U.S. Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for U.S. Government purposes.

FundersFunder number
National Nuclear Security Administration
U.S. Government
Lawrence Berkeley National LaboratoryDE-AC02-05CH11231
Lawrence Berkeley National Laboratory
National Science Foundation2323533
National Science Foundation
U.S. Department of Energy17-SC-20-SC
U.S. Department of Energy

    Keywords

    • Community detection
    • Distributed-memory
    • Dynamic graphs
    • MPI
    • Parallel algorithm
    • Temporal graphs

    Fingerprint

    Dive into the research topics of 'DyG-DPCD: A Distributed Parallel Community Detection Algorithm for Large-Scale Dynamic Graphs'. Together they form a unique fingerprint.

    Cite this