Abstract
Graph representations are pervasive in scientific and social computing. They serve as vital tools to model the interplay among different interacting entities. In this paper, we visit the problem of community detection, which is one of the most widely used graph operations toward scientific discovery. Community detection refers to the process of identifying tightly-knit subgroups of vertices in a large graph. These sub-groups (or communities) represent vertices that are tied together through common structure or function. Identification of communities could help in understanding the modular organization of complex networks. However, owing to large data sizes and high computational costs, performing community detection at scale has become increasingly challenging. Here, we present a detailed review and analysis of some of the leading computational methods and implementations developed for executing community detection on modern day multicore and manycore architectures. Our goals are to: a) define the problem of community detection and highlight its scientific significance; b) relate to challenges in parallelizing the operation on modern day architectures; c) provide a detailed report and logical organization of the approaches that have been designed for various architectures; and d) finally, provide insights into the strengths and suitability of different architectures for community detection, and a preview into the future trends of the area. It is our hope that this detailed treatment of community detection on parallel architectures can serve as an exemplar study for extending the application of modern day multicore and manycore architectures to other complex graph applications.
Original language | English |
---|---|
Pages (from-to) | 145-247 |
Number of pages | 103 |
Journal | Foundations and Trends in Electronic Design Automation |
Volume | 10 |
Issue number | 3 |
DOIs | |
State | Published - 2016 |
Externally published | Yes |
Funding
This work was supported in part by U.S. Department of Energy (DOE) Award DE-SC-0006516, by the US National Science Foundation (NSF) grants CCF-0845504, CNS-1059289, CNS-1128624, and CCF-1162202, an Army Research Office grant W911NF-12-1-0373. This work was also supported in part by the Applied Mathematics Program of the Office of Advance Scientific Computing Research within the Office of Science of the U.S. Department of Energy (DOE), and by the Integrated Compiler and Runtime Autotuning Infrastructure (ATPER) project at Pacific Northwest National Laboratory (PNNL). PNNL is operated by Battelle for the DOE under Contract DE-AC05-76RL01830.
Funders | Funder number |
---|---|
Office of Advance Scientific Computing Research | |
National Science Foundation | CNS-1059289, CNS-1128624, CCF-0845504, CCF-1162202 |
U.S. Department of Energy | DE-SC-0006516 |
Army Research Office | W911NF-12-1-0373 |
Pacific Northwest National Laboratory | DE-AC05-76RL01830 |