Cross-geography scientific data transferring trends and behavior

Zhengchun Liu, Ian Foster, Rajkumar Kettimuthu, Nageswara S.V. Rao

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

22 Scopus citations

Abstract

Wide area data transfers play an important role in many science applications but rely on expensive infrastructure that often delivers disappointing performance in practice. In response, we present a systematic examination of a large set of data transfer log data to characterize transfer characteristics, including the nature of the datasets transferred, achieved throughput, user behavior, and resource usage. This analysis yields new insights that can help design better data transfer tools, optimize networking and edge resources used for transfers, and improve the performance and experience for end users. Our analysis shows that (i) most of the datasets as well as individual files transferred are very small; (ii) data corruption is not negligible for large data transfers; and (iii) the data transfer nodes utilization is low. Insights gained from our analysis suggest directions for further analysis.

Original language: English
Title of host publication: HPDC 2018 - Proceedings of the 2018 International Symposium on High-Performance Parallel and Distributed Computing
Publisher: Association for Computing Machinery, Inc
Pages: 267-278
Number of pages: 12
ISBN (Electronic): 9781450357852
State: Published - Jun 11 2018
Event: 27th ACM International Symposium on High-Performance Parallel and Distributed Computing, HPDC 2018 - Tempe, United States
Duration: Jun 11 2018 to Jun 15 2018

Publication series

Name: HPDC 2018 - Proceedings of the 2018 International Symposium on High-Performance Parallel and Distributed Computing

Conference

Conference: 27th ACM International Symposium on High-Performance Parallel and Distributed Computing, HPDC 2018
Country/Territory: United States
City: Tempe
Period: 06/11/18 to 06/15/18

Funding

This material is based upon work supported by the U.S. Department of Energy, Office of Science, under contract number DE-AC02-06CH11357. We gratefully acknowledge the computing resources provided and operated by the Joint Laboratory for System Evaluation (JLSE) at Argonne National Laboratory. We also would like to thank the five anonymous reviewers for their helpful comments.

Funders and funder numbers
U.S. Department of Energy
Office of Science: DE-AC02-06CH11357
Argonne National Laboratory

Keywords

• File transfer
• GridFTP
• Usage management
• Wide area network
