TY - GEN
T1 - Graph Processing Platforms at Scale
T2 - 2015 15th IEEE International Symposium on Performance Analysis of Systems and Software, ISPASS 2015
AU - Lim, Seung Hwan
AU - Lee, Sangkeun
AU - Ganesh, Gautam
AU - Brown, Tyler C.
AU - Sukumar, Sreenivas R.
N1 - Publisher Copyright:
© 2015 IEEE.
PY - 2015/4/27
Y1 - 2015/4/27
N2 - Graph analysis has revealed patterns and relationships hidden in data from a variety of domains such as transportation networks, social networks, clinical pathways, and collaboration networks. As these networks grow in size, variety and complexity, it is a challenge to find the right combination of tools and implementation of algorithms to discover new insights from the data. Addressing this challenge, our study presents an extensive empirical evaluation of three representative graph processing platforms: Pegasus, GraphX, and Urika. Each system represents a combination of options in data model, processing paradigm, and infrastructure. We benchmark each platform using three popular graph mining operations, degree distribution, connected components, and PageRank over real-world graphs. Our experiments show that each graph processing platform owns a particular strength for different types of graph operations. While Urika performs the best in non-iterative graph operations like degree distribution, GraphX outperforms iterative operations like connected components and PageRank. We conclude this paper by discussing options to optimize the performance of a graph-theoretic operation on each platform for large-scale real world graphs.
AB - Graph analysis has revealed patterns and relationships hidden in data from a variety of domains such as transportation networks, social networks, clinical pathways, and collaboration networks. As these networks grow in size, variety and complexity, it is a challenge to find the right combination of tools and implementation of algorithms to discover new insights from the data. Addressing this challenge, our study presents an extensive empirical evaluation of three representative graph processing platforms: Pegasus, GraphX, and Urika. Each system represents a combination of options in data model, processing paradigm, and infrastructure. We benchmark each platform using three popular graph mining operations, degree distribution, connected components, and PageRank over real-world graphs. Our experiments show that each graph processing platform owns a particular strength for different types of graph operations. While Urika performs the best in non-iterative graph operations like degree distribution, GraphX outperforms iterative operations like connected components and PageRank. We conclude this paper by discussing options to optimize the performance of a graph-theoretic operation on each platform for large-scale real world graphs.
UR - http://www.scopus.com/inward/record.url?scp=84937416105&partnerID=8YFLogxK
U2 - 10.1109/ISPASS.2015.7095783
DO - 10.1109/ISPASS.2015.7095783
M3 - Conference contribution
AN - SCOPUS:84937416105
T3 - ISPASS 2015 - IEEE International Symposium on Performance Analysis of Systems and Software
SP - 42
EP - 51
BT - ISPASS 2015 - IEEE International Symposium on Performance Analysis of Systems and Software
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 29 March 2015 through 31 March 2015
ER -