Extending I/O through high performance data services

Hasan Abbasi, Jay Lofstead, Fang Zheng, Karsten Schwan, Matthew Wolf, Scott Klasky

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

39 Scopus citations

Abstract

The complexity of HPC systems has increased the burden on the developer as applications scale to hundreds of thousands of processing cores. Moreover, additional efforts are required to achieve acceptable I/O performance, where it is important how I/O is performed, which resources are used, and where I/O functionality is deployed. Specifically, by scheduling I/O data movement and by effectively placing operators affecting data volumes or information about the data, tremendous gains can be achieved both in the performance of simulation output and in the usability of output data. Previous studies have shown the value of using asynchronous I/O, of employing a staging area, and of performing select operations on data before it is written to disk. Leveraging such insights, this paper develops and experiments with higher level I/O abstractions, termed "data services", that manage output data from 'source to sink': where/when it is captured, transported towards storage, and filtered or manipulated by service functions to improve its information content. Useful services include data reduction, data indexing, and those that manage how I/O is performed, i.e., the control aspects of data movement. Our data services implementation distinguishes control aspects - the control plane - from data movement - the data plane, so that both may be changed separately. This results in runtime flexibility not only in which services to employ, but also in where to deploy them and how they use I/O resources. The outcome is consistently high levels of I/O performance at large scale, without requiring application change.
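
The separation of control plane and data plane described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the class names `ControlPlane` and `DataPlane`, the location labels, and the `downsample` operator are all hypothetical, chosen only to show how the choice and placement of a service might change at runtime while the application's write call stays the same.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Chunk = List[float]
Operator = Callable[[Chunk], Chunk]

# Hypothetical locations where a service may run, in source-to-sink order.
LOCATIONS = ("compute_node", "staging_area")


def downsample(chunk: Chunk) -> Chunk:
    """Illustrative data-reduction service: keep every second value."""
    return chunk[::2]


@dataclass
class ControlPlane:
    """Holds decisions only: which operators are deployed at which location."""
    placement: Dict[str, List[Operator]] = field(
        default_factory=lambda: {loc: [] for loc in LOCATIONS}
    )

    def deploy(self, location: str, op: Operator) -> None:
        self.placement[location].append(op)

    def undeploy(self, location: str, op: Operator) -> None:
        self.placement[location].remove(op)


@dataclass
class DataPlane:
    """Moves chunks from source to sink, applying the current placement."""
    control: ControlPlane
    sink: List[Chunk] = field(default_factory=list)  # stands in for storage

    def write(self, chunk: Chunk) -> None:
        # Consult the control plane on every write, so that changes to the
        # placement take effect without touching the caller.
        for loc in LOCATIONS:
            for op in self.control.placement[loc]:
                chunk = op(chunk)
        self.sink.append(chunk)


control = ControlPlane()
plane = DataPlane(control)

plane.write([1.0, 2.0, 3.0, 4.0])           # no services deployed yet
control.deploy("staging_area", downsample)  # runtime change, control plane only
plane.write([1.0, 2.0, 3.0, 4.0])           # identical application call
print(plane.sink)
```

In this sketch the application only ever calls `write`; deploying or removing a service is a control-plane operation, which mirrors the abstract's claim of flexibility without application change. Real staging would involve asynchronous transport between nodes, which is omitted here.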

Original language: English
Title of host publication: 2009 IEEE International Conference on Cluster Computing and Workshops, CLUSTER '09
State: Published - 2009
Event: 2009 IEEE International Conference on Cluster Computing and Workshops, CLUSTER '09 - New Orleans, LA, United States
Duration: Aug 31 2009 to Sep 4 2009

Publication series

Name: Proceedings - IEEE International Conference on Cluster Computing, ICCC
ISSN (Print): 1552-5244

Conference

Conference: 2009 IEEE International Conference on Cluster Computing and Workshops, CLUSTER '09
Country/Territory: United States
City: New Orleans, LA
Period: 08/31/09 to 09/4/09
