TY - GEN
T1 - Using Working Set Reorganization to Manage Storage Systems with Hard and Solid State Disks
AU - Chen, Junjie
AU - Liu, Jialin
AU - Roth, Philip
AU - Chen, Yong
N1 - Publisher Copyright: © 2014 IEEE.
PY - 2015/5/7
Y1 - 2015/5/7
N2 - Scientific applications from many problem domains produce and/or access large volumes of data. To support these applications, designers of high-end computing (HEC) systems have greatly increased the capacity of storage systems in recent years. However, because hard disk drives (HDDs) are still the dominant storage device used in HEC storage systems, and because HDD performance has not improved as quickly as the capacity, it can be challenging to deploy a storage system that provides both extreme capacity and extreme performance at a reasonable cost. Solid State Drives (SSDs) are a promising high-bandwidth and low-latency alternative to HDDs for HEC storage systems, but they too have deficiencies: small capacity, limited write cycles, and high cost when compared to HDDs. Because of their complementary characteristics, storage system designers are beginning to consider heterogeneous storage system designs that include both HDDs and SSDs. However, managing the workload so as to take advantage of the strengths of each type of storage device while controlling overhead is a major challenge. In this study, we propose a novel approach for managing a heterogeneous storage system called the Working Set-based Reorganization Scheme (WS-ROS). With WS-ROS, applications write to both HDDs and SSDs using all the available storage system bandwidth. Later, a background process reorganizes the data so as to place the data most likely to be read on SSDs while relegating the data most likely to be written and the data not likely to be accessed onto the slower but higher-capacity HDDs. For our evaluation workloads, the WS-ROS approach provided a 3× to 10× performance improvement compared to a heterogeneous storage system without a working set-based data reorganization scheme, suggesting the value of lazy reorganization of data based on data access working sets.
KW - High-end computing
KW - Solid State Drives
KW - data-intensive computing
KW - parallel file systems
KW - storage
UR - http://www.scopus.com/inward/record.url?scp=84946550514&partnerID=8YFLogxK
U2 - 10.1109/ICPPW.2014.45
DO - 10.1109/ICPPW.2014.45
M3 - Conference contribution
AN - SCOPUS:84946550514
T3 - Proceedings of the International Conference on Parallel Processing Workshops
SP - 283
EP - 291
BT - Proceedings - 43rd International Conference on Parallel Processing Workshops, ICPPW 2014
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 43rd International Conference on Parallel Processing Workshops, ICPPW 2014
Y2 - 9 September 2014 through 12 September 2014
ER -