STREAMING COMPRESSION OF SCIENTIFIC DATA VIA WEAK-SINDY

Benjamin P. Russo, M. Paul Laiu, Richard Archibald

Research output: Contribution to journal, Article, peer-reviewed

Abstract

In this paper, a streaming weak-SINDy algorithm is developed specifically for compressing streaming scientific data. The production of scientific data, either via simulation or experiments, is undergoing a stage of exponential growth, which makes data compression important and often necessary for storing and utilizing large scientific data sets. As opposed to classical "offline" compression algorithms that perform compression on a readily available data set, streaming compression algorithms compress data "online" while the data generated from simulation or experiments is still flowing through the system. This feature makes streaming compression algorithms well suited for scientific data compression, where storing the full data set offline is often infeasible. This work proposes a new streaming compression algorithm, streaming weak-SINDy, which takes advantage of the underlying data characteristics during compression. The streaming weak-SINDy algorithm constructs feature matrices and target vectors in the online stage via a streaming integration method in a memory-efficient manner. The feature matrices and target vectors are then used in the offline stage to build a model through a regression process that aims to recover equations that govern the evolution of the data. For compressing high-dimensional streaming data, we adopt a streaming proper orthogonal decomposition (POD) process to reduce the data dimension and then use the streaming weak-SINDy algorithm to compress the temporal data of the POD expansion. We propose modifications to the streaming weak-SINDy algorithm to accommodate the dynamically updated POD basis. By combining the model built by the streaming weak-SINDy algorithm with a small number of data samples, the full data flow can be reconstructed accurately at a low memory cost, as shown in the numerical tests.

Original language: English
Pages (from-to): C207-C234
Journal: SIAM Journal on Scientific Computing
Volume: 47
Issue number: 1
State: Published - 2025

Funding

This manuscript has been authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the work for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, world-wide license to publish or reproduce the submitted manuscript version of this work, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan). This work was supported by the Office of Advanced Scientific Computing Research, Office of Science, US Department of Energy and performed at the Oak Ridge National Laboratory, which is managed by UT-Battelle, LLC under contract DE-AC05-00OR22725.

Keywords

  • online compression
  • proper orthogonal decomposition
  • streaming data
  • surrogate modeling
