Abstract
Scientific applications continue to grow and produce extremely large amounts of data, which require efficient compression algorithms for long-term storage. Compression errors in scientific applications can have a deleterious impact on downstream processing, so it is crucial to preserve all the “known” Quantities of Interest (QoI) during compression. Most existing approaches bound the reconstruction error of the original data, or primary data (PD), but cannot directly control the error in the QoI. In this work, we propose a physics-informed compression technique that is composed of two parts: (i) reduction of the PD with bounded errors and (ii) preservation of the QoI. In the first step, we combine tensor decompositions, autoencoders, product quantizers, and error-bounded lossy compressors to bound the reconstruction error at high levels of compression. In the second step, we use constraint satisfaction post-processing followed by quantization to preserve the QoI. To illustrate the challenges of reducing the reconstruction errors of the PD and QoI, we focus on simulation data generated by a large-scale fusion code, XGC, which can produce tens of petabytes in a single day. The results show that our approach can achieve a high level of compression while accurately preserving the QoI within scientifically acceptable bounds.
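The two-part idea in the abstract can be illustrated with a toy sketch. This is not the pipeline from the paper: it swaps the tensor decompositions, autoencoders, and product quantizers for a plain uniform scalar quantizer, and it assumes the QoI are linear moments of a 1D distribution, restored by a minimum-norm correction. All function and variable names are hypothetical.

```python
import numpy as np

def quantize_error_bounded(x, eb):
    """Uniform scalar quantization with bin width 2*eb (illustrative stand-in
    for an error-bounded lossy compressor)."""
    return np.round(x / (2.0 * eb)).astype(np.int64)

def dequantize(codes, eb):
    """Reconstruct values; pointwise error is at most eb."""
    return codes * (2.0 * eb)

def restore_linear_qoi(x_rec, A, b):
    """Minimum-norm correction so that A @ x equals the stored QoI b.
    Assumes the rows of A are linearly independent."""
    residual = b - A @ x_rec
    lam = np.linalg.solve(A @ A.T, residual)
    return x_rec + A.T @ lam

# Toy primary data: a noisy Maxwellian-like profile on a velocity grid (assumed).
rng = np.random.default_rng(0)
v = np.linspace(-3.0, 3.0, 64)
dv = v[1] - v[0]
f = np.exp(-0.5 * v**2) + 1e-3 * rng.standard_normal(v.size)

# Assumed QoI: zeroth, first, and second moments, written as A @ f.
A = np.stack([np.ones_like(v), v, v**2]) * dv
b = A @ f  # stored alongside the compressed data

eb = 1e-2
f_rec = dequantize(quantize_error_bounded(f, eb), eb)   # step (i)
f_fix = restore_linear_qoi(f_rec, A, b)                 # step (ii)

print("max PD error:", np.max(np.abs(f - f_rec)))
print("QoI error before:", np.abs(A @ f_rec - b))
print("QoI error after: ", np.abs(A @ f_fix - b))
```

In this sketch the correction can push individual values slightly outside the pointwise bound `eb`, so a real scheme would need to recheck or jointly enforce both guarantees.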
Original language | English |
---|---|
Article number | 6718 |
Journal | Applied Sciences (Switzerland) |
Volume | 12 |
Issue number | 13 |
State | Published - Jul 1 2022 |
Funding
This research was partially supported by the DOE (Grant No. DE-SC0022265) and DOE RAPIDS2 (Grant No. DE-SC0021320).
Funders | Funder number |
---|---|
DOE RAPIDS2 | DE-SC0021320 |
U.S. Department of Energy | DE-SC0022265, DE-SC0021320 RAPIDS2 |
Keywords
- autoencoders
- constraint satisfaction
- data compression
- error guarantees
- fusion application
- moment preservation
- quantization