Abstract
Community detection (or clustering) in large-scale graphs is an important problem in graph mining. Communities reveal interesting organizational and functional characteristics of a network. Louvain algorithm is an efficient sequential algorithm for community detection. However, such sequential algorithms fail to scale for emerging large-scale data. Scalable parallel algorithms are necessary to process large graph datasets. In this work, we show a comparative analysis of our different parallel implementations of Louvain algorithm. We design parallel algorithms for Louvain method in shared memory and distributed memory settings. Developing distributed memory parallel algorithms is challenging because of inter-process communication and load balancing issues. We incorporate dynamic load balancing in our final algorithm DPLAL (Distributed Parallel Louvain Algorithm with Load-balancing). DPLAL overcomes the performance bottleneck of the previous algorithms and shows around 12-fold speedup scaling to a larger number of processors. We also compare the performance of our algorithm with some other prominent algorithms in the literature and get better or comparable performance. We identify the challenges in developing distributed memory algorithm and provide an optimized solution DPLAL showing performance analysis of the algorithm on large-scale real-world networks from different domains.
Original language | English |
---|---|
Pages (from-to) | 10275-10309 |
Number of pages | 35 |
Journal | Journal of Supercomputing |
Volume | 78 |
Issue number | 7 |
DOIs | |
State | Published - May 2022 |
Externally published | Yes |
Funding
This work has been supported by Louisiana Board of Regents RCS Grant LEQSF (2017-20)-RDA-25. We also thank the anonymous reviewers for the helpful comments and suggestions to improve this paper.
Funders | Funder number |
---|---|
Louisiana Board of Regents | LEQSF (2017-20)-RDA-25 |
Keywords
- Community detection
- Graph mining
- Load balancing
- Louvain method
- MPI
- OpenMP
- Parallel algorithms