Kernel assisted collective intra-node MPI communication among multi-core and many-core CPUs

Teng Ma, George Bosilca, Aurelien Bouteiller, Brice Goglin, Jeffrey M. Squyres, Jack J. Dongarra

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

36 Scopus citations

Abstract

Shared memory is among the most common approaches to implementing message passing within multi-core nodes. However, current shared memory techniques do not scale with increasing numbers of cores and expanding memory hierarchies, most notably when handling large data transfers and collective communication. Neglecting the underlying hardware topology, using copy-in/copy-out memory transfer operations, and overloading the memory subsystem with one-to-many types of operations are some of the most common mistakes in today's shared memory implementations. Unfortunately, they all negatively impact the performance and scalability of MPI libraries, and therefore of applications. In this paper, we present several kernel-assisted intra-node collective communication techniques that address these three issues on many-core systems. We also present a new Open MPI collective communication component that uses the KNEM Linux module for direct inter-process memory copying. Our Open MPI component implements several novel strategies to decrease the number of intermediate memory copies and improve data locality in order to diminish both cache pollution and memory pressure. Experimental results show that our KNEM-enabled Open MPI collective component can outperform state-of-the-art MPI libraries (Open MPI and MPICH2) on synthetic benchmarks, resulting in a significant improvement for a typical graph application.
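The difference between copy-in/copy-out and direct inter-process copying mentioned in the abstract can be illustrated with a small counting model. This is only a sketch under stated assumptions: it simulates ranks as in-process byte buffers, it does not use KNEM or Open MPI, and the function name `exchange` and its `direct` flag are hypothetical names chosen for this illustration, not part of the paper's component.

```python
def exchange(send, direct):
    """Model an all-to-all exchange and count the bytes copied.

    send[i][j] is the block that rank i sends to rank j.
    direct=False models copy-in/copy-out through a shared staging buffer.
    direct=True models a single copy from sender memory to receiver memory.
    """
    n = len(send)
    recv = [[b""] * n for _ in range(n)]
    copied = 0
    for src in range(n):
        for dst in range(n):
            if src == dst:
                continue  # local block: no inter-process transfer is modeled
            block = send[src][dst]
            if direct:
                out = bytearray(block)      # one copy: sender -> receiver
                copied += len(block)
            else:
                staging = bytearray(block)  # copy-in to the shared segment
                out = bytearray(staging)    # copy-out to receiver memory
                copied += 2 * len(block)
            recv[dst][src] = out
    return recv, copied


ranks, size = 4, 1024  # assumed values, chosen only for the illustration
send = [[bytes([src]) * size for _ in range(ranks)] for src in range(ranks)]
recv_two, two_copy = exchange(send, direct=False)
recv_one, one_copy = exchange(send, direct=True)
print(two_copy, one_copy)  # prints "24576 12288"
```

In this model the staged path moves every byte twice, so the bytes copied through the memory subsystem double for the same delivered data. The model ignores topology, caching, and pipelining, which the abstract names as separate concerns.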

Original language: English
Title of host publication: Proceedings - 2011 International Conference on Parallel Processing, ICPP 2011
Pages: 532-541
Number of pages: 10
DOIs
State: Published - 2011
Event: 40th International Conference on Parallel Processing, ICPP 2011 - Taipei City, Taiwan, Province of China
Duration: Sep 13 2011 to Sep 16 2011

Publication series

Name: Proceedings of the International Conference on Parallel Processing
ISSN (Print): 0190-3918

Conference

Conference: 40th International Conference on Parallel Processing, ICPP 2011
Country/Territory: Taiwan, Province of China
City: Taipei City
Period: 09/13/11 to 09/16/11

Keywords

  • Collective communication
  • Kernel
  • MPI
  • Many-core
  • Multi-core
  • NUMA
  • Shared memory
