Abstract
Abstract Community detection has become a fundamental operation in numerous graph-theoretic applications. It is used to reveal natural divisions that exist within real world networks without imposing prior size or cardinality constraints on the set of communities. Despite its potential for application, there is only limited support for community detection on large-scale parallel computers, largely owing to the irregular and inherently sequential nature of the underlying heuristics. In this paper, we present parallelization heuristics for fast community detection using the Louvain method as the serial template. The Louvain method is a multi-phase, iterative heuristic for modularity optimization. Originally developed by Blondel et al. (2008), the method has become increasingly popular owing to its ability to detect high modularity community partitions in a fast and memory-efficient manner. However, the method is also inherently sequential, thereby limiting its scalability. Here, we observe certain key properties of this method that present challenges for its parallelization, and consequently propose heuristics that are designed to break the sequential barrier. For evaluation purposes, we implemented our heuristics using OpenMP multithreading, and tested them over real world graphs derived from multiple application domains (e.g., internet, citation, biological). Compared to the serial Louvain implementation, our parallel implementation is able to produce community outputs with a higher modularity for most of the inputs tested, in comparable number or fewer iterations, while providing absolute speedups of up to 16× using 32 threads.
Original language | English |
---|---|
Article number | 2241 |
Pages (from-to) | 19-37 |
Number of pages | 19 |
Journal | Parallel Computing |
Volume | 47 |
DOIs | |
State | Published - Aug 4 2015 |
Externally published | Yes |
Funding
The authors would like to thank Drs. Emilie Hogan and Daniel Chavarría for input. The research was in part supported by DOE award DE-SC-0006516, NSF award IIS 0916463, and the Center for Adaptive Super Computing Software Multithreaded Architectures (CASS-MT) at the U.S. Department of Energy Pacific Northwest National Laboratory (PNNL). PNNL is operated by Battelle Memorial Institute under Contract DE-AC06-76RL01830. A preliminary version of this paper appeared in [11] .
Funders | Funder number |
---|---|
CASS-MT | |
U.S. Department of Energy Pacific Northwest National Laboratory | |
National Science Foundation | IIS 0916463 |
U.S. Department of Energy | DE-SC-0006516 |
Directorate for Computer and Information Science and Engineering | 0916463 |
Battelle | DE-AC06-76RL01830 |
Pacific Northwest National Laboratory |
Keywords
- Community detection
- Graph clustering
- Graph coloring
- Parallel graph heuristics