Learning concave-convex profiles of data transport over dedicated connections

Nageswara S.V. Rao, Satyabrata Sen, Zhengchun Liu, Rajkumar Kettimuthu, Ian Foster

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

10 Scopus citations

Abstract

Dedicated data transport infrastructures are increasingly being deployed to support distributed big-data and high-performance computing scenarios. These infrastructures employ data transfer nodes that use sophisticated software stacks to support network transport among sites, which often house distributed file and storage systems. Throughput measurements collected over such infrastructures for a range of round-trip times (RTTs) reflect the underlying complex end-to-end connections, and have revealed dichotomous throughput profiles as functions of RTT. In particular, concave regions of throughput profiles at lower RTTs indicate near-optimal performance, and convex regions at higher RTTs indicate bottlenecks due to factors such as buffer or credit limits. We present a machine learning method that explicitly infers these concave and convex regions, and the transitions between them, using sigmoid functions. We also provide distribution-free confidence estimates for the generalization error of these concave-convex profile estimates. Throughput profiles for data transfers over 10 Gbps connections with 0–366 ms RTT provide important performance insights, including the near optimality of transfers performed with the XDD tool between XFS filesystems, and the performance limits of wide-area Lustre extensions using LNet routers. A direct application of generic machine learning packages neither adequately highlights these critical performance regions nor provides confidence estimates that are as precise.
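The abstract does not spell out how the sigmoid functions are composed or fitted, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a single decreasing ("flipped") sigmoid, which is concave below its inflection point and convex above it, so the fitted inflection point plays the role of the concave-to-convex transition RTT. The function names, the grid-search fit, the parameter values, and the data are all hypothetical; the data are synthetic and are not measurements from the paper.

```python
import math

def flipped_sigmoid(tau, amp, slope, tau0):
    """Decreasing sigmoid in RTT tau (ms): concave for tau < tau0, convex for tau > tau0."""
    return amp / (1.0 + math.exp(slope * (tau - tau0)))

def fit_profile(rtts, throughputs, slopes, transitions):
    """Grid-search slope and transition RTT; the amplitude has a closed-form least-squares value."""
    best = None
    for slope in slopes:
        for tau0 in transitions:
            basis = [flipped_sigmoid(t, 1.0, slope, tau0) for t in rtts]
            amp = sum(b * y for b, y in zip(basis, throughputs)) / sum(b * b for b in basis)
            err = sum((amp * b - y) ** 2 for b, y in zip(basis, throughputs))
            if best is None or err < best[0]:
                best = (err, amp, slope, tau0)
    return best[1:]

def second_difference(f, tau, h=1.0):
    """Discrete curvature: negative where f is concave, positive where f is convex."""
    return f(tau + h) - 2.0 * f(tau) + f(tau - h)

# Synthetic, noise-free throughput values (Gbps) generated from assumed parameters.
rtts = [0, 50, 100, 150, 200, 250, 300, 366]
throughputs = [flipped_sigmoid(t, 9.5, 0.03, 150) for t in rtts]

# Hypothetical search grids for the slope and the transition RTT (ms).
slopes = [0.01, 0.02, 0.03, 0.04, 0.05]
transitions = list(range(0, 367, 10))

amp, slope, tau0 = fit_profile(rtts, throughputs, slopes, transitions)
fitted = lambda t: flipped_sigmoid(t, amp, slope, tau0)

print("estimated transition RTT (ms):", tau0)
print("curvature at 50 ms:", second_difference(fitted, 50))
print("curvature at 250 ms:", second_difference(fitted, 250))
```

In this sketch the sign of the second difference separates the two regimes the abstract describes: a negative value marks the concave region at lower RTTs, and a positive value marks the convex region at higher RTTs. The distribution-free confidence estimates mentioned in the abstract are not reproduced here.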

Original language: English
Title of host publication: Machine Learning for Networking - 1st International Conference, MLN 2018, Revised Selected Papers
Editors: Éric Renault, Selma Boumerdassi, Paul Mühlethaler
Publisher: Springer Verlag
Pages: 1-22
Number of pages: 22
ISBN (Print): 9783030199449
DOIs
State: Published - 2019
Event: 1st International Conference on Machine Learning for Networking, MLN 2018 - Paris, France
Duration: Nov 27 2018 – Nov 29 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11407 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 1st International Conference on Machine Learning for Networking, MLN 2018
Country/Territory: France
City: Paris
Period: 11/27/18 – 11/29/18

Funding

This work is funded by the RAMSES project and the Applied Mathematics program, Office of Advanced Computing Research, U.S. Department of Energy, and by the Extreme Scale Systems Center, sponsored by the U.S. Department of Defense, and was performed at Oak Ridge National Laboratory, managed by UT-Battelle, LLC for the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.

Funders
Office of Advanced Computing Research
UT-Battelle
U.S. Department of Defense
U.S. Department of Energy
Oak Ridge National Laboratory

    Keywords

    • Concavity-convexity
    • Data transport
    • Generalization bounds
    • Throughput profile
