Machine Learning Techniques for Data Reduction of Climate Applications

Xiao Li, Qian Gong, Jaemoon Lee, Scott Klasky, Anand Rangarajan, Sanjay Ranka

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

Abstract

Scientists conduct large-scale simulations to compute derived quantities-of-interest (QoI) from primary data. Often, QoI are linked to specific features, regions, or time intervals, such that data can be adaptively reduced without compromising the integrity of QoI. For many spatiotemporal applications, these QoI are binary in nature and represent the presence or absence of a physical phenomenon. We present a pipelined compression approach that first uses neural-network-based techniques to derive regions where QoI are highly likely to be present. Then, we employ a Guaranteed Autoencoder (GAE) to compress data with differential error bounds. GAE uses QoI information to apply low-error compression only to these regions. This results in overall high compression ratios while still achieving the downstream goals of the simulation or data collection. Experimental results are presented for climate data generated from the E3SM simulation model for downstream quantities such as tropical cyclone and atmospheric river detection and tracking. These results show that our approach is superior to comparable methods in the literature.
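
The idea of differential error bounds can be illustrated with a minimal sketch. This is not the authors' Guaranteed Autoencoder: it swaps in plain uniform scalar quantization, and it assumes the QoI mask is already given (in the paper it is derived by a neural network). The function names, the sample values, and the two error bounds are all hypothetical and chosen only for illustration.

```python
# Sketch only: region-adaptive, error-bounded quantization.
# A uniform quantizer stands in for the paper's GAE, and the QoI mask
# is hand-written here rather than predicted by a neural network.

def quantize(value, error_bound):
    """Map a value to an integer code; bin width is 2 * error_bound."""
    return round(value / (2.0 * error_bound))

def dequantize(code, error_bound):
    """Reconstruct the bin center for a code."""
    return code * 2.0 * error_bound

def compress(field, qoi_mask, tight_eb, loose_eb):
    """Use the tight bound where the mask is True, the loose bound elsewhere."""
    return [
        [quantize(v, tight_eb if m else loose_eb) for v, m in zip(row, mask_row)]
        for row, mask_row in zip(field, qoi_mask)
    ]

def decompress(codes, qoi_mask, tight_eb, loose_eb):
    """Invert compress(); the same mask must be available to the decoder."""
    return [
        [dequantize(c, tight_eb if m else loose_eb) for c, m in zip(row, mask_row)]
        for row, mask_row in zip(codes, qoi_mask)
    ]

def max_abs_error(original, restored, qoi_mask, inside):
    """Largest absolute error over cells whose mask value equals `inside`."""
    worst = 0.0
    for o_row, r_row, m_row in zip(original, restored, qoi_mask):
        for o, r, m in zip(o_row, r_row, m_row):
            if m == inside:
                worst = max(worst, abs(o - r))
    return worst

# Hypothetical 2x3 field with one cell flagged as QoI-relevant.
field = [[1000.2, 1001.7, 1003.1],
         [998.4, 985.6, 999.9]]
qoi_mask = [[False, False, False],
            [False, True, False]]

codes = compress(field, qoi_mask, tight_eb=0.01, loose_eb=1.0)
restored = decompress(codes, qoi_mask, tight_eb=0.01, loose_eb=1.0)
print(max_abs_error(field, restored, qoi_mask, inside=True))
print(max_abs_error(field, restored, qoi_mask, inside=False))
```

Cells outside the mask fall into far fewer distinct codes, which is what would let a later entropy coder shrink them, while masked cells keep the tight bound. The decoder needs the same mask, so in practice the mask itself would have to be stored or recomputed.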

Original language: English
Title of host publication: Advances in Knowledge Discovery and Data Mining - 29th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2025, Proceedings
Editors: Xintao Wu, Myra Spiliopoulou, Can Wang, Vipin Kumar, Longbing Cao, Yanqiu Wu, Zhangkai Wu, Yu Yao
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 437-448
Number of pages: 12
ISBN (Print): 9789819681693
DOIs
State: Published - 2025
Event: 29th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2025 - Sydney, Australia
Duration: Jun 10 2025 to Jun 13 2025

Publication series

Name: Lecture Notes in Computer Science
Volume: 15870 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 29th Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2025
Country/Territory: Australia
City: Sydney
Period: 06/10/25 to 06/13/25

Funding

This work was partially supported by DOE RAPIDS2 DE-SC0021320 and DOE DE-SC0022265.

Keywords

  • Climate Application
  • Machine Learning
  • Quantities of Interest
  • Region-Adaptive Data Compression
