Compressing the incompressible with ISABELA: In-situ reduction of spatio-temporal data

Sriram Lakshminarasimhan, Neil Shah, Stephane Ethier, Scott Klasky, Rob Latham, Rob Ross, Nagiza F. Samatova

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

144 Scopus citations

Abstract

Modern large-scale scientific simulations running on HPC systems generate data on the order of terabytes during a single run. To lessen the I/O load during a simulation run, scientists are forced to capture data infrequently, thereby making data collection an inherently lossy process. Yet, lossless compression techniques are hardly suitable for scientific data due to its inherently random nature; for the applications used here, they offer a compression rate of less than 10%. They also impose significant overhead during decompression, making them unsuitable for data analysis and visualization that require repeated data access. To address this problem, we propose an effective method for In-situ Sort-And-B-spline Error-bounded Lossy Abatement (ISABELA) of scientific data that is widely regarded as effectively incompressible. With ISABELA, we apply a preconditioner to seemingly random and noisy data along spatial resolution to achieve an accurate fitting model that guarantees a ≥0.99 correlation with the original data. We further take advantage of temporal patterns in scientific data to compress data by ≈85%, while introducing only a negligible overhead on simulations in terms of runtime. ISABELA significantly outperforms existing lossy compression methods, such as Wavelet compression. Moreover, besides being a communication-free and scalable compression technique, ISABELA is an inherently local decompression method; that is, it does not need to decode the entire data, making it attractive for random access.
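The sort-then-fit idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a single window of values, swaps in piecewise-linear interpolation between a few sampled points for the cubic B-spline fit, and omits the error-bounding step. The names `compress_window`, `decode_one`, and `pearson` are hypothetical.

```python
import bisect
import random


def compress_window(values, n_knots):
    """Sort one window; keep each element's rank plus a few samples of the sorted curve."""
    n = len(values)
    # Sorting is the preconditioner: noisy data becomes a smooth monotone curve.
    order = sorted(range(n), key=values.__getitem__)
    rank_of = [0] * n
    for rank, idx in enumerate(order):
        rank_of[idx] = rank
    # Evenly spaced sample positions along the sorted curve, both ends included.
    knot_pos = [round(k * (n - 1) / (n_knots - 1)) for k in range(n_knots)]
    knot_val = [values[order[p]] for p in knot_pos]
    return rank_of, knot_pos, knot_val


def decode_one(i, rank_of, knot_pos, knot_val):
    """Approximate element i alone, without reconstructing the rest of the window."""
    r = rank_of[i]
    # Find the segment [knot_pos[j], knot_pos[j + 1]] that contains rank r.
    j = min(max(bisect.bisect_right(knot_pos, r) - 1, 0), len(knot_pos) - 2)
    p0, p1 = knot_pos[j], knot_pos[j + 1]
    t = (r - p0) / (p1 - p0)
    # Piecewise-linear stand-in for evaluating a B-spline at rank r.
    return knot_val[j] + t * (knot_val[j + 1] - knot_val[j])


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


if __name__ == "__main__":
    random.seed(0)
    window = [random.random() for _ in range(1024)]  # synthetic noise, not simulation data
    rank_of, knot_pos, knot_val = compress_window(window, n_knots=32)
    restored = [decode_one(i, rank_of, knot_pos, knot_val) for i in range(len(window))]
    print("correlation with original:", pearson(window, restored))
```

In this sketch the 1024 floating-point values are replaced by 32 sampled values plus one integer rank per element, so any saving depends on the ranks being cheaper to store than the original values. Decoding one element touches only its rank and two neighbouring samples, which is the kind of local access the abstract describes.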

Original language: English
Title of host publication: Euro-Par 2011 Parallel Processing - 17th International Conference, Proceedings
Pages: 366-379
Number of pages: 14
Edition: PART 1
State: Published - 2011
Event: 17th International Conference on Parallel Processing, Euro-Par 2011 - Bordeaux, France
Duration: Aug 29 2011 to Sep 2 2011

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 1
Volume: 6852 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 17th International Conference on Parallel Processing, Euro-Par 2011
Country/Territory: France
City: Bordeaux
Period: 08/29/11 to 09/2/11

Keywords

  • B-spline
  • Data-intensive Application
  • High Performance Computing
  • In-situ Processing
  • Lossy Compression
