TensorFlow Doing HPC: An Evaluation of TensorFlow Performance in HPC applications

Steven W.D. Chien, Stefano Markidis, Vyacheslav Olshevsky, Yaroslav Bulatov, Erwin Laure, Jeffrey Vetter

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

10 Scopus citations

Abstract

TensorFlow is a popular emerging open-source programming framework supporting the execution of distributed applications on heterogeneous hardware. While TensorFlow has been initially designed for developing Machine Learning (ML) applications, in fact TensorFlow aims at supporting the development of a much broader range of application kinds that are outside the ML domain and can possibly include HPC applications. However, very few experiments have been conducted to evaluate TensorFlow performance when running HPC workloads on supercomputers. This work addresses this lack by designing four traditional HPC benchmark applications: STREAM, matrix-matrix multiply, Conjugate Gradient (CG) solver and Fast Fourier Transform (FFT). We analyze their performance on two supercomputers with accelerators and evaluate the potential of TensorFlow for developing HPC applications. Our tests show that TensorFlow can fully take advantage of high performance networks and accelerators on supercomputers. Running our TensorFlow STREAM benchmark, we obtain over 50% of theoretical communication bandwidth on our testing platform. We find an approximately 2×, 1.7× and 1.8× performance improvement when increasing the number of GPUs from two to four in the matrix-matrix multiply, CG and FFT applications respectively. All our performance results demonstrate that TensorFlow has high potential of emerging also as HPC programming framework for heterogeneous supercomputers.

Original languageEnglish
Title of host publicationProceedings - 2019 IEEE 33rd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2019
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages509-518
Number of pages10
ISBN (Electronic)9781728135106
DOIs
StatePublished - May 2019
Event33rd IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2019 - Rio de Janeiro, Brazil
Duration: May 20 2019May 24 2019

Publication series

NameProceedings - 2019 IEEE 33rd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2019

Conference

Conference33rd IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2019
Country/TerritoryBrazil
CityRio de Janeiro
Period05/20/1905/24/19

Funding

ACKNOWLEDGMENT Funding for the work is received from the European Commission H2020 program, Grant Agreement No. 801039 (EPiGRAM-HS). Experiments were performed on resources provided by the Swedish National Infrastructure for Computing (SNIC) at PDC Center for High Performance Computing and HPC2N.

FundersFunder number
EPiGRAM-HS
European Commission H2020 program
Horizon 2020 Framework Programme801039

    Keywords

    • Emerging Programming Environment
    • HPC Applications
    • Heterogeneous Supercomputers
    • Parallel Computing
    • TensorFlow

    Fingerprint

    Dive into the research topics of 'TensorFlow Doing HPC: An Evaluation of TensorFlow Performance in HPC applications'. Together they form a unique fingerprint.

    Cite this