Usage Pattern-Driven Dynamic Data Layout Reorganization

Houjun Tang, Suren Byna, Steve Harenberg, Xiaocheng Zou, Wenzhao Zhang, Kesheng Wu, Bin Dong, Oliver Rubel, Kristofer Bouchard, Scott Klasky, Nagiza F. Samatova

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

13 Scopus citations

Abstract

As scientific simulations and experiments move toward extremely large scales and generate massive amounts of data, the data access performance of analytic applications becomes crucial. A mismatch often happens between write and read patterns of data accesses, typically resulting in poor read performance. Data layout reorganization has been used to improve the locality of data accesses. However, current data reorganizations are static and focus on generating a single (or set of) optimized layouts that rely on prior knowledge of exact future access patterns. We propose a framework that dynamically recognizes the data usage patterns, replicates the data of interest in multiple reorganized layouts that would benefit common read patterns, and makes runtime decisions on selecting a favorable layout for a given read pattern. This framework supports reading individual elements and chunks of a multi-dimensional array of variables. Our pattern-driven layout selection strategy achieves multi-fold speedups compared to reading from the original dataset.
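
The runtime layout selection described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the cost model (counting contiguous runs), the two candidate layouts, the replication threshold, and the names `contiguous_runs` and `LayoutSelector` are all assumptions made for illustration on a 2D array.

```python
from collections import Counter

# Hypothetical candidate layouts; the original (written) layout is row-major.
ALL_LAYOUTS = ("row_major", "col_major")


def contiguous_runs(layout, region, shape):
    """Count the separate contiguous reads needed to fetch `region`.

    region is (row_start, row_count, col_start, col_count).
    Fewer runs stands in for better read locality (assumed cost model).
    """
    _, n_rows, _, n_cols = region
    rows, cols = shape
    if layout == "row_major":
        # Full-width rows are adjacent on disk and merge into one run.
        return 1 if n_cols == cols else n_rows
    if layout == "col_major":
        # Full-height columns are adjacent on disk and merge into one run.
        return 1 if n_rows == rows else n_cols
    raise ValueError(f"unknown layout: {layout}")


class LayoutSelector:
    """Tracks read patterns, 'replicates' hot layouts, picks one per read."""

    def __init__(self, shape, threshold=3):
        self.shape = shape
        self.threshold = threshold        # reads needed before replicating
        self.available = {"row_major"}    # only the original layout at first
        self.history = Counter()          # how often each layout was ideal

    def record(self, region):
        """Note which layout would have served this read best."""
        best = min(ALL_LAYOUTS,
                   key=lambda l: contiguous_runs(l, region, self.shape))
        self.history[best] += 1
        if self.history[best] >= self.threshold:
            # Stands in for writing a reorganized replica of the data.
            self.available.add(best)

    def select(self, region):
        """Pick the cheapest layout among those that currently exist."""
        candidates = [l for l in ALL_LAYOUTS if l in self.available]
        return min(candidates,
                   key=lambda l: contiguous_runs(l, region, self.shape))


selector = LayoutSelector(shape=(1000, 1000))
column_read = (0, 1000, 5, 1)             # one full column
for _ in range(3):
    selector.record(column_read)
print(selector.select(column_read))       # col_major
print(selector.select((7, 1, 0, 1000)))   # row_major (one full row)
```

In this sketch a column-oriented replica only becomes available once column reads have been observed often enough, which mirrors the idea of reorganizing in response to usage rather than from prior knowledge of the access pattern.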

Original language: English
Title of host publication: Proceedings - 2016 16th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, CCGrid 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 356-365
Number of pages: 10
ISBN (Electronic): 9781509024520
DOIs
State: Published - Jul 18 2016
Event: 16th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, CCGrid 2016 - Cartagena, Colombia
Duration: May 16 2016 to May 19 2016

Publication series

Name: Proceedings - 2016 16th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, CCGrid 2016

Conference

Conference: 16th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing, CCGrid 2016
Country/Territory: Colombia
City: Cartagena
Period: 05/16/16 to 05/19/16

Funding

This work is supported in part by the U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research under contracts DE-AC02-05CH11231 at Lawrence Berkeley National Laboratory and DE-AC05-00OR22725 at Oak Ridge National Laboratory, and by the U.S. National Science Foundation (Expeditions in Computing and EAGER program). This research used resources from the National Energy Research Scientific Computing Center and Oak Ridge Leadership Computing Facility.

Funders (funder number)
Oak Ridge National Laboratory
National Science Foundation
U.S. Department of Energy
Office of Science
Advanced Scientific Computing Research: DE-AC05-00OR22725, DE-AC02-05CH11231

Keywords

• data access performance
• data layout reorganization
• data usage pattern
