A General Framework for Error-controlled Unstructured Scientific Data Compression

Qian Gong, Zhe Wang, Viktor Reshniak, Xin Liang, Jieyang Chen, Qing Liu, Tushar M. Athawale, Yi Ju, Anand Rangarajan, Sanjay Ranka, Norbert Podhorszki, Rick Archibald, Scott Klasky

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Data compression plays a key role in reducing storage and I/O costs. Traditional lossy methods primarily target data on rectilinear grids and cannot leverage the spatial coherence in unstructured mesh data, leading to suboptimal compression ratios. We present a multi-component, error-bounded compression framework designed to enhance the compression of floating-point unstructured mesh data, which is common in scientific applications. Our approach involves interpolating mesh data onto a rectilinear grid and then separately compressing the grid interpolation and the interpolation residuals. This method is general, independent of mesh types and topologies, and can be seamlessly integrated with existing lossy compressors for improved performance. We evaluated our framework across twelve variables from two synthetic datasets and two real-world simulation datasets. The results indicate that the multi-component framework consistently outperforms state-of-the-art lossy compressors on unstructured data, achieving, on average, a 2.3× to 3.5× improvement in compression ratios, with error bounds ranging from 1 × 10^-6 to 1 × 10^-2. We further investigate the impact of hyperparameters, such as grid spacing and error allocation, to deliver optimal compression ratios in diverse datasets.
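The two-component idea in the abstract (interpolate onto a rectilinear grid, then compress the grid and the interpolation residuals separately) can be illustrated with a minimal one-dimensional sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names, the nearest-node and linear interpolation choices, and the uniform scalar quantizer that stands in for a real error-bounded lossy compressor. In this sketch the residual is computed against the decompressed grid, so the pointwise error is governed by the residual bound alone; the paper's actual error-allocation scheme may differ.

```python
import math
import random

def quantize(values, eb):
    """Uniform scalar quantizer: a stand-in for an error-bounded lossy compressor."""
    return [round(v / (2.0 * eb)) for v in values]

def dequantize(codes, eb):
    """Invert quantize(); each value is recovered to within eb."""
    return [c * 2.0 * eb for c in codes]

def make_grid(n_grid):
    """Uniform grid on [0, 1] with n_grid points."""
    h = 1.0 / (n_grid - 1)
    return [j * h for j in range(n_grid)]

def nodes_to_grid(xs, fs, grid):
    """Nearest-node interpolation of scattered node values onto the grid."""
    out = []
    for g in grid:
        nearest = min(range(len(xs)), key=lambda i: abs(xs[i] - g))
        out.append(fs[nearest])
    return out

def grid_to_nodes(grid, gvals, xs):
    """Linear interpolation from the uniform grid back to the node positions."""
    h = grid[1] - grid[0]
    out = []
    for x in xs:
        j = min(int((x - grid[0]) / h), len(grid) - 2)
        t = (x - grid[j]) / h
        out.append((1.0 - t) * gvals[j] + t * gvals[j + 1])
    return out

def compress(xs, fs, n_grid, eb_grid, eb_res):
    """Return quantized grid values and quantized interpolation residuals."""
    grid = make_grid(n_grid)
    grid_codes = quantize(nodes_to_grid(xs, fs, grid), eb_grid)
    # Predict node values from the *decompressed* grid so that the
    # decompressor can reproduce exactly the same prediction.
    pred = grid_to_nodes(grid, dequantize(grid_codes, eb_grid), xs)
    res_codes = quantize([f - p for f, p in zip(fs, pred)], eb_res)
    return grid_codes, res_codes

def decompress(xs, grid_codes, res_codes, n_grid, eb_grid, eb_res):
    """Rebuild node values as grid prediction plus decoded residual."""
    grid = make_grid(n_grid)
    pred = grid_to_nodes(grid, dequantize(grid_codes, eb_grid), xs)
    res = dequantize(res_codes, eb_res)
    return [p + r for p, r in zip(pred, res)]

# Hypothetical test data: 200 scattered nodes sampling a smooth field.
random.seed(0)
xs = sorted(random.random() for _ in range(200))
fs = [math.sin(2.0 * math.pi * x) for x in xs]

EB = 1e-3  # requested pointwise error bound (applied to the residuals)
grid_codes, res_codes = compress(xs, fs, 17, 1e-2, EB)
recon = decompress(xs, grid_codes, res_codes, 17, 1e-2, EB)
max_err = max(abs(a - b) for a, b in zip(fs, recon))
print("max pointwise error:", max_err)
```

The grid size (17) and the looser grid bound (1e-2) play the role of the grid-spacing and error-allocation hyperparameters mentioned in the abstract. In a real pipeline the integer codes would additionally be entropy coded, or both components would be handed to an existing lossy compressor.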

Original language: English
Title of host publication: Proceedings - 2024 IEEE 20th International Conference on e-Science, e-Science 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350365610
State: Published - 2024
Event: 20th IEEE International Conference on e-Science, e-Science 2024 - Osaka, Japan
Duration: Sep 16 2024 to Sep 20 2024

Publication series

Name: Proceedings - 2024 IEEE 20th International Conference on e-Science, e-Science 2024

Conference

Conference: 20th IEEE International Conference on e-Science, e-Science 2024
Country/Territory: Japan
City: Osaka
Period: 09/16/24 to 09/20/24

Funding

This research was supported by the SIRIUS-2 ASCR research project, the Scientific Discovery through Advanced Computing (SciDAC) program, specifically the RAPIDS-2 SciDAC institute, and the GE-ORNL CRADA data reduction project. This research used resources of the Oak Ridge Leadership Computing Facility, which is a DOE Office of Science User Facility.

Funders

• Advanced Scientific Computing Research
• Office of Science

    Keywords

    • error-control
    • multi-components
    • unstructured data compression
