Towards efficient MapReduce using MPI

Torsten Hoefler, Andrew Lumsdaine, Jack Dongarra

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

59 Scopus citations

Abstract

MapReduce is an emerging programming paradigm for data-parallel applications. We discuss common strategies to implement a MapReduce runtime and propose an optimized implementation on top of MPI. Our implementation combines redistribution and reduce and moves them into the network. This approach especially benefits applications with a limited number of output keys in the map phase. We also show how anticipated MPI-2.2 and MPI-3 features, such as MPI_Reduce_local and nonblocking collective operations, can be used to implement and optimize MapReduce with a performance improvement of up to 25% on 127 cluster nodes. Finally, we discuss additional features that would enable MPI to more efficiently support all MapReduce applications.
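The idea of folding redistribution into the reduce step can be illustrated with a small simulation. The sketch below is not the authors' implementation and uses no MPI. It is plain Python that assumes a small key set known to every rank in advance (the hypothetical `KEYS` list), so each rank's map output fits in a dense vector with one slot per key. Under that assumption, a single element-wise tree reduction to rank 0 delivers the final per-key values without a separate key-exchange phase. The helper names `map_phase`, `reduce_local`, and `tree_reduce` are illustrative only; `reduce_local` stands in for the role that a local combine such as `MPI_Reduce_local` could play.

```python
# Hypothetical fixed key set, assumed to be known to every rank before the map phase.
KEYS = ["error", "info", "warn"]
KEY_INDEX = {key: i for i, key in enumerate(KEYS)}


def map_phase(lines):
    """Map step for one rank: count known keys into a dense vector, one slot per key."""
    vector = [0] * len(KEYS)
    for line in lines:
        for word in line.split():
            slot = KEY_INDEX.get(word)
            if slot is not None:
                vector[slot] += 1
    return vector


def reduce_local(a, b):
    """Combine two partial results element-wise (stand-in for a local reduction)."""
    return [x + y for x, y in zip(a, b)]


def tree_reduce(vectors):
    """Simulate a tree reduction to rank 0 over a non-empty list of per-rank vectors."""
    partial = list(vectors)
    stride = 1
    while stride < len(partial):
        # In each round, rank r absorbs the partial result held by rank r + stride.
        for rank in range(0, len(partial) - stride, 2 * stride):
            partial[rank] = reduce_local(partial[rank], partial[rank + stride])
        stride *= 2
    return partial[0]


if __name__ == "__main__":
    # Three simulated ranks, each with its own slice of the input.
    inputs = [["error info", "info"], ["warn"], ["error error info"]]
    result = tree_reduce([map_phase(chunk) for chunk in inputs])
    print(dict(zip(KEYS, result)))  # {'error': 3, 'info': 3, 'warn': 1}
```

The dense-vector layout only pays off when the number of distinct keys is small, which matches the abstract's remark that the approach especially benefits applications with a limited number of output keys.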

Original language: English
Title of host publication: Recent Advances in Parallel Virtual Machine and Message Passing Interface - 16th European PVM/MPI Users' Group Meeting, Proceedings
Publisher: Springer Verlag
Pages: 240-249
Number of pages: 10
ISBN (Print): 3642037690, 9783642037696
State: Published - 2009
Externally published: Yes
Event: 16th European Parallel Virtual Machine and Message Passing Interface Users' Group Meeting, EuroPVM/MPI - Espoo, Finland
Duration: Sep 7 2009 to Sep 10 2009

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5759 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 16th European Parallel Virtual Machine and Message Passing Interface Users' Group Meeting, EuroPVM/MPI
Country/Territory: Finland
City: Espoo
Period: 09/7/09 to 09/10/09
