TY - GEN
T1 - Fast Algorithms for Scientific Data Compression
AU - Banerjee, Tania
AU - Lee, Jaemoon
AU - Choi, Jong
AU - Gong, Qian
AU - Chen, Jieyang
AU - Klasky, Scott
AU - Rangarajan, Anand
AU - Ranka, Sanjay
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
AB - Many scientific simulations and experiments generate terabytes to petabytes of data daily, necessitating data compression techniques. Unlike video and image compression, scientists require methods that accurately preserve primary data (PD) and derived quantities of interest (QoIs). In our previous work, we demonstrated the effectiveness of hybrid compression techniques that combine machine learning with traditional approaches. This paper presents innovative computational techniques aimed at expediting the compression pipeline. Our experiments, conducted on two distinct platforms with a large-scale XGC-based fusion simulation, demonstrate that the overhead incurred by these new approaches is less than one percent of the computational resources needed for the simulation.
KW - Data compression
KW - High-performance computing
KW - Machine learning
UR - http://www.scopus.com/inward/record.url?scp=85190594107&partnerID=8YFLogxK
U2 - 10.1109/HiPC58850.2023.00030
DO - 10.1109/HiPC58850.2023.00030
M3 - Conference contribution
AN - SCOPUS:85190594107
T3 - Proceedings - 2023 IEEE 30th International Conference on High Performance Computing, Data, and Analytics, HiPC 2023
SP - 143
EP - 152
BT - Proceedings - 2023 IEEE 30th International Conference on High Performance Computing, Data, and Analytics, HiPC 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 30th Annual IEEE International Conference on High Performance Computing, Data, and Analytics, HiPC 2023
Y2 - 18 December 2023 through 21 December 2023
ER -