TY - GEN
T1 - Towards efficient MapReduce using MPI
AU - Hoefler, Torsten
AU - Lumsdaine, Andrew
AU - Dongarra, Jack
PY - 2009
Y1 - 2009
N2 - MapReduce is an emerging programming paradigm for data-parallel applications. We discuss common strategies to implement a MapReduce runtime and propose an optimized implementation on top of MPI. Our implementation combines redistribution and reduce and moves them into the network. This approach especially benefits applications with a limited number of output keys in the map phase. We also show how anticipated MPI-2.2 and MPI-3 features, such as MPI_Reduce_local and nonblocking collective operations, can be used to implement and optimize MapReduce with a performance improvement of up to 25% on 127 cluster nodes. Finally, we discuss additional features that would enable MPI to more efficiently support all MapReduce applications.
AB - MapReduce is an emerging programming paradigm for data-parallel applications. We discuss common strategies to implement a MapReduce runtime and propose an optimized implementation on top of MPI. Our implementation combines redistribution and reduce and moves them into the network. This approach especially benefits applications with a limited number of output keys in the map phase. We also show how anticipated MPI-2.2 and MPI-3 features, such as MPI_Reduce_local and nonblocking collective operations, can be used to implement and optimize MapReduce with a performance improvement of up to 25% on 127 cluster nodes. Finally, we discuss additional features that would enable MPI to more efficiently support all MapReduce applications.
UR - http://www.scopus.com/inward/record.url?scp=70350443831&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-03770-2_30
DO - 10.1007/978-3-642-03770-2_30
M3 - Conference contribution
AN - SCOPUS:70350443831
SN - 3642037690
SN - 9783642037696
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 240
EP - 249
BT - Recent Advances in Parallel Virtual Machine and Message Passing Interface - 16th European PVM/MPI Users' Group Meeting, Proceedings
PB - Springer Verlag
T2 - 16th European Parallel Virtual Machine and Message Passing Interface Users' Group Meeting, EuroPVM/MPI
Y2 - 7 September 2009 through 10 September 2009
ER -