Abstract
The rise of containers has led to a broad proliferation of container images. The associated storage performance and capacity requirements place high pressure on the infrastructure of container registries that store and serve images. Exploiting the high file redundancy in real-world container images is a promising approach to drastically reduce the demanding storage requirements of the growing registries. However, existing deduplication techniques significantly degrade the performance of registries because of the high layer restore overhead. We propose DupHunter, a new Docker registry architecture, which not only natively deduplicates layers for space savings but also reduces layer restore overhead. DupHunter supports several configurable deduplication modes, which provide different levels of storage efficiency, durability, and performance, to support a range of uses. To mitigate the negative impact of deduplication on image download times, DupHunter introduces a two-tier storage hierarchy with a novel layer prefetch/preconstruct cache algorithm based on user access patterns. Under real workloads, in the highest data reduction mode, DupHunter reduces storage space by up to 6.9× compared to current implementations. In the highest performance mode, DupHunter can reduce the GET layer latency by up to 2.8× compared to the state of the art.
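To make the trade-off in the abstract concrete, the following is a minimal sketch of file-level layer deduplication, assuming a layer can be modeled as a mapping from file path to file bytes. It is not DupHunter's implementation; the class `FileDedupStore` and its methods `put_layer`, `restore_layer`, and `stored_bytes` are hypothetical names introduced only for illustration. Each unique file body is stored once under its SHA-256 digest, and a per-layer recipe records what is needed to rebuild the layer, which is the restore step whose overhead the abstract refers to.

```python
import hashlib


class FileDedupStore:
    """Toy content-addressed store (hypothetical): each unique file body is kept once."""

    def __init__(self):
        self.blobs = {}    # SHA-256 hex digest -> file bytes
        self.recipes = {}  # layer id -> list of (path, digest) pairs

    def put_layer(self, layer_id, files):
        """Store a layer given as {path: bytes}, keeping only unseen file bodies."""
        recipe = []
        for path, data in sorted(files.items()):
            digest = hashlib.sha256(data).hexdigest()
            self.blobs.setdefault(digest, data)  # no-op if this content already exists
            recipe.append((path, digest))
        self.recipes[layer_id] = recipe

    def restore_layer(self, layer_id):
        """Rebuild the layer from its recipe; this lookup work is the restore overhead."""
        return {path: self.blobs[digest] for path, digest in self.recipes[layer_id]}

    def stored_bytes(self):
        """Total bytes held after deduplication."""
        return sum(len(body) for body in self.blobs.values())


# Two illustrative layers that share one identical file.
store = FileDedupStore()
layer_a = {"bin/sh": b"shell-binary", "etc/os-release": b"ID=demo"}
layer_b = {"bin/sh": b"shell-binary", "app/main.py": b"print('hi')"}
store.put_layer("layer-a", layer_a)
store.put_layer("layer-b", layer_b)

raw_bytes = sum(len(body) for layer in (layer_a, layer_b) for body in layer.values())
print(raw_bytes, store.stored_bytes())
```

In this sketch the shared file is stored once, so the store holds fewer bytes than the two layers combined, but every GET must reassemble the layer from its recipe. A cache of already restored layers, such as the prefetch/preconstruct cache the abstract describes, would sit in front of `restore_layer` to hide that cost.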
Original language | English |
---|---|
Title of host publication | Proceedings of the 2020 USENIX Annual Technical Conference, ATC 2020 |
Publisher | USENIX Association |
Pages | 769-783 |
Number of pages | 15 |
ISBN (Electronic) | 9781939133144 |
State | Published - 2020 |
Externally published | Yes |
Event | 2020 USENIX Annual Technical Conference, ATC 2020 - Virtual, Online; Duration: Jul 15 2020 → Jul 17 2020 |
Publication series
Name | Proceedings of the 2020 USENIX Annual Technical Conference, ATC 2020 |
---|---|
Conference
Conference | 2020 USENIX Annual Technical Conference, ATC 2020 |
---|---|
City | Virtual, Online |
Period | 07/15/20 → 07/17/20 |
Funding
We are thankful to the anonymous reviewers and our shepherd Abhinav Duggal for their valuable feedback. This work is sponsored in part by the National Science Foundation under grants CCF-1919113, CNS-1405697, CNS-1615411, and OAC-2004751.