Runtime I/O re-routing + throttling on HPC storage

Research output: Contribution to conference › Paper › peer-review

23 Scopus citations

Abstract

Massively parallel storage systems are becoming increasingly prevalent on HPC systems due to the emergence of a new generation of data-intensive applications. To achieve the I/O throughput and capacity demanded by data-intensive applications, storage systems typically deploy a large number of storage devices (also known as LUNs or data stores). This allows parallel applications to access storage concurrently, so the aggregate I/O throughput scales linearly with the number of storage devices, reducing the application's end-to-end time. On a production system where storage devices are shared among multiple applications, contention is often a major problem, leading to a significant reduction in I/O throughput. In this paper, we describe our efforts to resolve this issue in the context of HPC using a balanced re-routing + throttling approach. The proposed scheme re-routes I/O requests to a less congested storage location in a controlled manner, improving write performance while limiting the impact on reads.
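The core idea in the abstract, diverting writes to a less-loaded storage device, but only at a bounded rate so reads against the original location are not starved, could be sketched roughly as follows. This is a minimal illustration, not the paper's actual design: the class names, the pending-bytes congestion metric, the 2x congestion threshold, and the token-bucket throttle are all assumptions made for the example.

```python
import time

class StorageTarget:
    """A storage device (LUN) with a toy congestion metric.

    `pending_bytes` stands in for queue depth; a real system would
    sample server-side load statistics instead.
    """
    def __init__(self, name):
        self.name = name
        self.pending_bytes = 0

    def write(self, nbytes):
        self.pending_bytes += nbytes  # enqueue the write


class ReRouter:
    """Re-route writes to the least-congested target, throttled by a
    token bucket so that only a bounded volume of data is diverted
    per second (limiting the later read-back penalty)."""
    def __init__(self, targets, tokens_per_sec, burst):
        self.targets = targets
        self.rate = tokens_per_sec   # re-route budget in bytes/sec
        self.burst = burst           # maximum accumulated budget
        self.tokens = burst
        self.last = time.monotonic()

    def _refill(self):
        now = time.monotonic()
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now

    def write(self, home, nbytes):
        """Write `nbytes` destined for `home`; may divert elsewhere.
        Returns the target that actually received the write."""
        self._refill()
        best = min(self.targets, key=lambda t: t.pending_bytes)
        # Divert only when the home target is clearly congested
        # (illustrative 2x threshold) AND the throttle budget allows.
        if (best is not home and self.tokens >= nbytes
                and home.pending_bytes > 2 * best.pending_bytes):
            self.tokens -= nbytes
            best.write(nbytes)
            return best
        home.write(nbytes)
        return home
```

With a congested home LUN and budget available, a write is diverted to the idle LUN; once the token bucket is exhausted, writes fall back to the home target, which is the "controlled manner" the abstract refers to.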

Original language: English
State: Published - 2013
Event: 5th USENIX Workshop on Hot Topics in Storage and File Systems, HotStorage 2013 - San Jose, United States
Duration: Jun 27, 2013 - Jun 28, 2013

Conference

Conference: 5th USENIX Workshop on Hot Topics in Storage and File Systems, HotStorage 2013
Country/Territory: United States
City: San Jose
Period: 06/27/13 - 06/28/13

Funding

The authors would like to thank our shepherd Nohhyun Park from Cloud Physics and the anonymous reviewers for their valuable suggestions, and the Department of Energy Office of Science for its sponsorship.
