TY - GEN
T1 - Throughput Analytics of Data Transfer Infrastructures
AU - Rao, Nageswara S.V.
AU - Liu, Qiang
AU - Liu, Zhengchun
AU - Kettimuthu, Rajkumar
AU - Foster, Ian
N1 - Publisher Copyright: © 2019, ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering.
PY - 2019
Y1 - 2019
N2 - To support increasingly distributed scientific and big-data applications, powerful data transfer infrastructures are being built with dedicated networks and software frameworks customized to distributed file systems and data transfer nodes. The data transfer performance of such infrastructures critically depends on the combined choices of file, disk, and host systems as well as network protocols and file transfer software, all of which may vary across sites. The randomness of throughput measurements makes it challenging to assess the impact of these choices on the performance of infrastructure or its parts. We propose regression-based throughput profiles by aggregating measurements from sites of the infrastructure, with RTT as the independent variable. The peak values and convex-concave shape of a profile together determine the overall throughput performance of memory and file transfers, and its variations show the performance differences among the sites. We then present projection and difference operators, and coefficients of throughput profiles to characterize the performance of infrastructure and its parts, including sites and file transfer tools. In particular, the utilization-concavity coefficient provides a value in the range [0, 1] that reflects overall transfer effectiveness. We present results of measurements collected using (i) testbed experiments over dedicated 0–366 ms 10 Gbps connections with combinations of TCP versions, file systems, host systems and transfer tools, and (ii) Globus GridFTP transfers over production infrastructure with varying site configurations.
KW - Data transfer
KW - Infrastructure
KW - Throughput profile
UR - http://www.scopus.com/inward/record.url?scp=85063038780&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-12971-2_2
DO - 10.1007/978-3-030-12971-2_2
M3 - Conference contribution
AN - SCOPUS:85063038780
SN - 9783030129705
T3 - Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST
SP - 20
EP - 40
BT - Testbeds and Research Infrastructures for the Development of Networks and Communities - 13th EAI International Conference, TridentCom 2018, Proceedings
A2 - Gao, Honghao
A2 - Miao, Huaikou
A2 - Yang, Xiaoxian
A2 - Yin, Yuyu
PB - Springer Verlag
T2 - 13th EAI International Conference on Testbeds and Research Infrastructures for the Development of Networks and Communities, TridentCom 2018
Y2 - 1 December 2018 through 3 December 2018
ER -