A system-aware optimized data organization for efficient scientific analytics

Yuan Tian, Scott Klasky, Weikuan Yu, Hasan Abbasi, Bin Wang, Norbert Podhorszki

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

Large-scale scientific applications on High End Computing systems produce large volumes of highly complex datasets. Such data poses a grand challenge to conventional storage systems, which need efficient I/O solutions during both the simulation runtime and the data post-processing phases. With the mounting needs of scientific discovery, the read performance of large-scale simulations has become a critical issue for the HPC community. In this study, we propose a system-aware optimized data organization strategy that organizes the data blocks of multidimensional scientific data efficiently, based on the simulation output and the underlying storage systems, thereby enabling efficient scientific analytics. Our experimental results demonstrate a performance speedup of up to 72 times for the combustion simulation S3D, compared to the logically contiguous data layout.

Original language: English
Title of host publication: HPDC '12 - Proceedings of the 21st ACM Symposium on High-Performance Parallel and Distributed Computing
Pages: 125-126
Number of pages: 2
State: Published - 2012
Event: 21st ACM Symposium on High-Performance Parallel and Distributed Computing, HPDC '12 - Delft, Netherlands
Duration: Jun 18 2012 to Jun 22 2012

Publication series

Name: HPDC '12 - Proceedings of the 21st ACM Symposium on High-Performance Parallel and Distributed Computing

Conference

Conference: 21st ACM Symposium on High-Performance Parallel and Distributed Computing, HPDC '12
Country/Territory: Netherlands
City: Delft
Period: 06/18/12 to 06/22/12

Keywords

  • Data Layout
  • I/O
