TY - GEN
T1 - Layout-aware I/O Scheduling for terabits data movement
AU - Kim, Youngjae
AU - Atchley, Scott
AU - Vallee, Geoffroy R.
AU - Shipman, Galen M.
PY - 2013
Y1 - 2013
AB - Many science facilities, such as the Department of Energy's Leadership Computing Facilities and experimental facilities including the Spallation Neutron Source, Stanford Linear Accelerator Center, and Advanced Photon Source, produce massive amounts of experimental and simulation data. These data are often shared among the facilities and with collaborating institutions. Moving large datasets over the wide-area network (WAN) is a major problem inhibiting collaboration. Next-generation terabit networks will help alleviate the problem; however, the parallel storage systems on the end-system hosts at these institutions can become a bottleneck for terabit data movement. The parallel file system (PFS) is shared by simulation systems, experimental systems, and analysis and visualization clusters, in addition to wide-area data movers. These competing uses often induce temporary but significant I/O load imbalances on the storage system, which impact the performance of all users. The problem is a serious concern because some resources are more expensive (e.g., supercomputers) or have time-critical deadlines (e.g., experimental data from a light source), yet parallel file systems handle all requests fairly even if some storage servers are under heavy load. This paper investigates the problem of competing workloads accessing the parallel file system and how the performance of wide-area data movement can be improved in these environments. First, we study the I/O load imbalance problem using actual I/O performance data collected from the Spider storage system at the Oak Ridge Leadership Computing Facility. Second, we present layout-aware I/O optimization solutions on end-system hosts for bulk data movement. Our evaluation shows that these I/O optimization techniques can avoid I/O-congested disk groups, improving storage I/O times on parallel storage systems for terabit data movement.
KW - I/O Scheduling
KW - Networking
KW - Storage Systems
UR - http://www.scopus.com/inward/record.url?scp=84893234053&partnerID=8YFLogxK
U2 - 10.1109/BigData.2013.6691661
DO - 10.1109/BigData.2013.6691661
M3 - Conference contribution
AN - SCOPUS:84893234053
SN - 9781479912926
T3 - Proceedings - 2013 IEEE International Conference on Big Data, Big Data 2013
SP - 44
EP - 51
BT - Proceedings - 2013 IEEE International Conference on Big Data, Big Data 2013
PB - IEEE Computer Society
T2 - 2013 IEEE International Conference on Big Data, Big Data 2013
Y2 - 6 October 2013 through 9 October 2013
ER -