Abstract
HPC systems typically rely on the fixed-lifetime (FLT) data retention strategy, which considers only the temporal locality of data accesses to parallel file systems. However, our extensive analysis of traces from a leadership-class HPC system suggests that the FLT approach often fails to capture the dynamics of user behavior and leads to undesired data purges. In this study, we propose an activeness-based data retention (ActiveDR) solution, which approaches data retention from a holistic, activeness-based perspective. By evaluating the frequency and impact of users' activities, ActiveDR prioritizes the file purge process for inactive users and rewards active users with extended file lifetimes on parallel storage. Our extensive evaluations based on traces from the prior Titan supercomputer show that, when reaching the same purge target, ActiveDR achieves up to a 37% reduction in file misses compared to the current FLT retention methodology.
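The contrast between the two policies can be illustrated with a minimal Python sketch. The abstract does not specify how activeness is scored or how lifetimes are extended, so everything here is an assumption made for illustration: the `FileRecord` type, the per-user `activeness` scores, and the rule that a file's effective lifetime is `lifetime_days * (1 + score)` are all hypothetical and should not be read as the ActiveDR algorithm itself.

```python
from dataclasses import dataclass


@dataclass
class FileRecord:
    path: str
    owner: str
    size: int         # bytes
    idle_days: float  # days since the file was last accessed


def flt_purge(files, lifetime_days):
    """Fixed-lifetime policy: select every file idle longer than the lifetime."""
    return [f for f in files if f.idle_days > lifetime_days]


def activeness_purge(files, activeness, lifetime_days, target_bytes):
    """Select expired files of the least active users first, up to a target.

    `activeness` maps owner -> score >= 0 (hypothetical; higher = more active).
    The effective lifetime lifetime_days * (1 + score) is an assumed rule.
    """
    def score(f):
        return activeness.get(f.owner, 0.0)

    def expired(f):
        return f.idle_days > lifetime_days * (1 + score(f))

    # Least active owners first; within an owner, longest-idle files first.
    candidates = sorted(
        (f for f in files if expired(f)),
        key=lambda f: (score(f), -f.idle_days),
    )

    purged, freed = [], 0
    for f in candidates:
        if freed >= target_bytes:
            break  # purge target reached; remaining files are retained
        purged.append(f)
        freed += f.size
    return purged
```

Under these assumptions, a file owned by an active user survives longer than the fixed lifetime would allow, and the purge stops as soon as the space target is met rather than removing every expired file.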
Original language | English |
---|---|
Title of host publication | Proceedings of SC 2021 |
Subtitle of host publication | The International Conference for High Performance Computing, Networking, Storage and Analysis: Science and Beyond |
Publisher | IEEE Computer Society |
ISBN (Electronic) | 9781450384421 |
State | Published - Nov 14 2021 |
Event | 33rd International Conference for High Performance Computing, Networking, Storage and Analysis: Science and Beyond, SC 2021 - Virtual, Online, United States; Duration: Nov 14 2021 → Nov 19 2021 |
Publication series
Name | International Conference for High Performance Computing, Networking, Storage and Analysis, SC |
---|---|
ISSN (Print) | 2167-4329 |
ISSN (Electronic) | 2167-4337 |
Conference
Conference | 33rd International Conference for High Performance Computing, Networking, Storage and Analysis: Science and Beyond, SC 2021 |
---|---|
Country/Territory | United States |
City | Virtual, Online |
Period | 11/14/21 → 11/19/21 |
Funding
We are thankful to the anonymous reviewers for their valuable feedback. This research is supported in part by the National Science Foundation under grants CCF-1718336, OAC-1835892, and CNS-1817094. This manuscript has been authored by an author at Lawrence Berkeley National Laboratory under Contract No. DE-AC02-05CH11231 with the U.S. Department of Energy, and has been authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the U.S. Department of Energy (DOE). The U.S. Government retains, and the publisher, by accepting the article for publication, acknowledges that the U.S. Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for U.S. Government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).
Keywords
- Data management
- Data retention
- Purge policy
- Storage resource management
- Storage tiering
- User behavior