Machine Learning Methods for Connection RTT and Loss Rate Estimation Using MPI Measurements Under Random Losses

Nageswara S.V. Rao, Neena Imam, Zhengchun Liu, Rajkumar Kettimuthu, Ian Foster

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Scientific computations are expected to be increasingly distributed across wide-area networks, and Message Passing Interface (MPI) has been shown to scale to support their communications over long distances. Application-level measurements of MPI operations reflect the connection Round-Trip Time (RTT) and loss rate, and machine learning methods have been previously developed to estimate them under deterministic periodic losses. In this paper, we consider more complex, random losses with uniform, Poisson and Gaussian distributions. We study five disparate machine learning methods, with linear and non-linear, and smooth and non-smooth properties, to estimate RTT and loss rate over 10 Gbps connections with 0–366 ms RTT. The diversity and complexity of these estimators, combined with the randomness of losses and TCP’s non-linear response, together rule out the selection of a single best estimator among them; instead, we fuse them to retain their design diversity. Overall, the results show that accurate estimates can be generated at low loss rates but become inaccurate at loss rates of 10% and higher, thereby illustrating both their strengths and limitations.

Original language: English
Title of host publication: Machine Learning for Networking - 2nd IFIP TC 6 International Conference, MLN 2019, Revised Selected Papers
Editors: Selma Boumerdassi, Éric Renault, Paul Mühlethaler
Publisher: Springer
Pages: 154-174
Number of pages: 21
ISBN (Print): 9783030457778
State: Published - 2020
Event: 2nd International Conference on Machine Learning for Networking, MLN 2019 - Paris, France
Duration: Dec 3 2019 to Dec 5 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12081 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 2nd International Conference on Machine Learning for Networking, MLN 2019
Country/Territory: France
City: Paris
Period: 12/3/19 to 12/5/19

Funding

This work is funded by the RAMSES project and the Applied Mathematics program, Office of Advanced Computing Research, U.S. Department of Energy, and by the Extreme Scale Systems Center, sponsored by the U.S. Department of Defense, and performed at Oak Ridge National Laboratory, managed by UT-Battelle, LLC for the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.

Funders (with funder number where given):
Office of Advanced Computing Research
U.S. Department of Defense
U.S. Department of Energy
Oak Ridge National Laboratory
UT-Battelle: DE-AC05-00OR22725

    Keywords

    • Generalization bounds
    • Information fusion
    • Loss rate
    • Machine Learning
    • Message Passing Interface
    • Regression
    • Round Trip Time
