Tartan: Evaluating Modern GPU Interconnect via a Multi-GPU Benchmark Suite

Ang Li, Shuaiwen Leon Song, Jieyang Chen, Xu Liu, Nathan Tallent, Kevin Barker

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

46 Scopus citations

Abstract

High-performance multi-GPU computing has become an inevitable trend due to the ever-increasing demand for computation capability in emerging domains such as deep learning, big data, and planet-scale applications. However, the lack of a deep understanding of how modern GPUs can be connected, and of the actual impact of state-of-the-art interconnects on multi-GPU application performance, has become a hurdle. Additionally, the absence of a practical multi-GPU benchmark suite poses further obstacles to conducting research in the multi-GPU era. In this paper, we fill the gap by proposing a multi-GPU benchmark suite named Tartan, which contains microbenchmarks, scale-up applications, and scale-out applications. We then apply Tartan to evaluate the four latest types of modern GPU interconnects, i.e., PCI-e, NVLink-V1, NVLink-V2, and InfiniBand with GPUDirect-RDMA, from two recently released NVIDIA super AI platforms as well as ORNL's exascale prototype system. Based on empirical evaluation, we observe four new types of NUMA effects: three are triggered by NVLink's topology, connectivity, and routing, while one is caused by PCI-e (i.e., anti-locality). They are very important for performance tuning in multi-GPU environments. Our evaluation results show that, unless the current CPU-GPU master-slave programming model can be replaced, it is difficult for scale-up multi-GPU applications to really benefit from faster intra-node interconnects such as NVLink; for inter-node scale-out applications, although the interconnect is more crucial to overall performance, GPUDirect-RDMA does not always appear to be the optimal choice. The Tartan benchmark suite, including the microbenchmarks, is open source and available at http://github.com/uuudown/Tartan.

Original language: English
Title of host publication: 2018 IEEE International Symposium on Workload Characterization, IISWC 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 191-202
Number of pages: 12
ISBN (Electronic): 9781538667804
DOIs
State: Published - Dec 11 2018
Externally published: Yes
Event: 2018 IEEE International Symposium on Workload Characterization, IISWC 2018 - Raleigh, United States
Duration: Sep 30 2018 → Oct 2 2018

Publication series

Name: 2018 IEEE International Symposium on Workload Characterization, IISWC 2018

Conference

Conference: 2018 IEEE International Symposium on Workload Characterization, IISWC 2018
Country/Territory: United States
City: Raleigh
Period: 09/30/18 → 10/2/18

Funding

We thank all the anonymous reviewers for their constructive comments and suggestions for improving this work. This research was supported by the U.S. DOE Office of Science, Office of Advanced Scientific Computing Research, under award 66150: "CENATE - Center for Advanced Architecture Evaluation". This research was also supported by ECP Application Assessment program within the Exascale Computing Project (17-SC-20-SC), a joint project of the U.S. Department of Energy's Office of Science and National Nuclear Security Administration, responsible for delivering a capable exascale ecosystem, including software, applications, and hardware technology, to support the nation's exascale computing imperative. This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. The Pacific Northwest National Laboratory (PNNL) is operated by Battelle for the U.S. Department of Energy under contract DE-AC05-76RL01830.

Funders: Funder number
ECP Application Assessment program: 17-SC-20-SC
DOE Office of Science
U.S. Department of Energy
Office of Science: DE-AC05-76RL01830
National Nuclear Security Administration
Advanced Scientific Computing Research: 66150
Pacific Northwest National Laboratory
