Abstract
Scientific applications in fields such as high energy physics, computational fluid dynamics (CFD), and climate science generate vast amounts of data at high velocities. This exponential growth in data production is outpacing advances in computing power, network capabilities, and storage capacities. To address this challenge, data compression or reduction techniques are crucial. These scientific datasets have underlying data structures that consist of structured and block-structured multidimensional meshes where each grid point corresponds to a tensor. It is important that data reduction techniques leverage the strong spatial and temporal correlations that are ubiquitous in these applications. Additionally, applications such as CFD process tensors comprising a hundred or more species and their attributes at each grid point, so reduction techniques should also be able to leverage interrelationships between the elements in each tensor. In this paper, we propose an attention-based hierarchical compression method utilizing a block-wise compression setup. We introduce an attention-based hyper-block autoencoder to capture inter-block correlations, followed by a block-wise encoder to capture block-specific information. A PCA-based post-processing step is employed to guarantee error bounds for each data block. Our method effectively captures both spatiotemporal and inter-variable correlations within and between data blocks. Compared to the state-of-the-art SZ3, our method achieves up to 8× higher compression ratio on the multi-variable S3D dataset. When evaluated on single-variable setups using the E3SM and XGC datasets, our method still achieves up to 3× and 2× higher compression ratio, respectively.
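The abstract does not spell out how the PCA-based post-processing step enforces the per-block error bound. The following is a minimal sketch of one way such a step could work, not the authors' implementation. It assumes an L2 error bound `tau` per block, an uncentered PCA basis computed from the reconstruction residuals, and greedy selection of the largest coefficients without quantization. The function names `pca_basis` and `correct_block` are hypothetical, and the noisy `approx` array merely stands in for the autoencoder output.

```python
import numpy as np


def pca_basis(residuals):
    """Return a full orthonormal basis (d x d) from the block residuals.

    Rows of `residuals` are flattened blocks. The right singular vectors of
    the (uncentered) residual matrix serve as principal directions.
    """
    _, _, vt = np.linalg.svd(residuals, full_matrices=True)
    return vt.T


def correct_block(x, x_hat, basis, tau):
    """Add residual PCA coefficients, largest first, until ||x - x_corr||_2 <= tau.

    Assumption: the bound is an L2 norm per block and coefficients are stored
    unquantized. Returns the corrected block and the number of coefficients kept.
    """
    coeffs = basis.T @ (x - x_hat)       # residual expressed in the PCA basis
    order = np.argsort(-np.abs(coeffs))  # largest-magnitude coefficients first
    kept = np.zeros_like(coeffs)
    corrected = x_hat.copy()
    n_kept = 0
    for idx in order:
        if np.linalg.norm(x - corrected) <= tau:
            break
        kept[idx] = coeffs[idx]
        corrected = x_hat + basis @ kept
        n_kept += 1
    return corrected, n_kept


# Synthetic demo: 32 blocks of 16 values each, with a noisy stand-in reconstruction.
rng = np.random.default_rng(0)
blocks = rng.normal(size=(32, 16))
approx = blocks + 0.1 * rng.normal(size=blocks.shape)

tau = 0.05
basis = pca_basis(blocks - approx)
results = [correct_block(x, xh, basis, tau) for x, xh in zip(blocks, approx)]
corrected = np.array([c for c, _ in results])
kept_counts = [k for _, k in results]
max_err = np.linalg.norm(blocks - corrected, axis=1).max()
```

Because the basis is complete and orthonormal, keeping every coefficient reproduces the residual exactly (up to floating-point error), so the loop can always reach the bound. The number of coefficients kept per block is the storage cost of the guarantee, which is why a more accurate first-stage reconstruction translates into a higher compression ratio.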
| Original language | English |
|---|---|
| Title of host publication | Proceedings - 2024 IEEE International Conference on Big Data, BigData 2024 |
| Editors | Wei Ding, Chang-Tien Lu, Fusheng Wang, Liping Di, Kesheng Wu, Jun Huan, Raghu Nambiar, Jundong Li, Filip Ilievski, Ricardo Baeza-Yates, Xiaohua Hu |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 1039-1048 |
| Number of pages | 10 |
| ISBN (Electronic) | 9798350362480 |
| DOIs | |
| State | Published - 2024 |
| Externally published | Yes |
| Event | 2024 IEEE International Conference on Big Data, BigData 2024 - Washington, United States. Duration: Dec 15 2024 → Dec 18 2024 |
Publication series
| Name | Proceedings - 2024 IEEE International Conference on Big Data, BigData 2024 |
|---|---|
Conference
| Conference | 2024 IEEE International Conference on Big Data, BigData 2024 |
|---|---|
| Country/Territory | United States |
| City | Washington |
| Period | 12/15/24 → 12/18/24 |
Funding
This work was partially supported by DOE RAPIDS2 DE-SC0021320 and DOE DE-SC0022265.