TY - GEN
T1 - Cross-geography scientific data transferring trends and behavior
AU - Liu, Zhengchun
AU - Foster, Ian
AU - Kettimuthu, Rajkumar
AU - Rao, Nageswara S.V.
N1 - Publisher Copyright: © 2018 Copyright held by the owner/author(s).
PY - 2018/6/11
Y1 - 2018/6/11
AB - Wide area data transfers play an important role in many science applications but rely on expensive infrastructure that often delivers disappointing performance in practice. In response, we present a systematic examination of a large set of data transfer log data to characterize transfer characteristics, including the nature of the datasets transferred, achieved throughput, user behavior, and resource usage. This analysis yields new insights that can help design better data transfer tools, optimize networking and edge resources used for transfers, and improve the performance and experience for end users. Our analysis shows that (i) most of the datasets as well as individual files transferred are very small; (ii) data corruption is not negligible for large data transfers; and (iii) the data transfer nodes utilization is low. Insights gained from our analysis suggest directions for further analysis.
KW - File transfer
KW - GridFTP
KW - Usage management
KW - Wide area network
UR - http://www.scopus.com/inward/record.url?scp=85050097894&partnerID=8YFLogxK
U2 - 10.1145/3208040.3208053
DO - 10.1145/3208040.3208053
M3 - Conference contribution
AN - SCOPUS:85050097894
T3 - HPDC 2018 - Proceedings of the 2018 International Symposium on High-Performance Parallel and Distributed Computing
SP - 267
EP - 278
BT - HPDC 2018 - Proceedings of the 2018 International Symposium on High-Performance Parallel and Distributed Computing
PB - Association for Computing Machinery, Inc
T2 - 27th ACM International Symposium on High-Performance Parallel and Distributed Computing, HPDC 2018
Y2 - 11 June 2018 through 15 June 2018
ER -