Online and Scalable Data Compression Pipeline with Guarantees on Quantities of Interest

Tania Banerjee, Jaemoon Lee, Jong Choi, Qian Gong, Jieyang Chen, Choongseok Chang, Scott Klasky, Anand Rangarajan, Sanjay Ranka

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

1 Scopus citation

Abstract

Data compression is becoming critical for data-intensive scientific applications. Scientists require compression techniques that accurately preserve derived quantities of interest (QoIs). Prior work has shown that a pipeline can be built to guarantee error on the primary data (PD) within user-defined bounds and achieve near-floating point QoI errors. In this paper, we present novel computational approaches for accelerating the pipeline and demonstrate results that enable concurrent execution of compression in parallel with the simulation nodes. This allows compression, including the writing of the required compression data, for the previous time step to be completed while the simulation proceeds with the current time step. Overall, the approach presented in this paper results in a 6-8 times improvement in computational overhead compared to previous work. These results were obtained using data generated by a large-scale fusion code called XGC, which produces hundreds of terabytes of data in a single day.

Original language: English
Title of host publication: Proceedings 2023 IEEE 19th International Conference on e-Science, e-Science 2023
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350322231
State: Published - 2023
Event: 19th IEEE International Conference on e-Science, e-Science 2023 - Limassol, Cyprus
Duration: Oct 9 2023 to Oct 14 2023

Publication series

Name: Proceedings 2023 IEEE 19th International Conference on e-Science, e-Science 2023

Conference

Conference: 19th IEEE International Conference on e-Science, e-Science 2023
Country/Territory: Cyprus
City: Limassol
Period: 10/9/23 to 10/14/23
