Compression of tokamak boundary plasma simulation data using a maximum volume algorithm for matrix skeleton decomposition

Sebastian De Pascuale, Kenneth Allen, David L. Green, Jeremy D. Lore

Research output: Contribution to journal › Article › peer-review

1 Scopus citations

Abstract

This report demonstrates satisfactory data compression of SOLPS-ITER simulation output ranging from 2D fields to 1D profiles and 0D scalar variables with a novel matrix decomposition approach. The singular value decomposition (SVD) scales poorly for large matrix sizes and is ill-suited to the high-dimensional data common to fusion plasma physics simulation. We employ the columns-submatrix-rows (CUR) matrix factorization technique in order to compute a low-rank approximation up to two orders of magnitude faster than the SVD, but within a nominal L2-norm relative error of ϵ = 10⁻². In addition, the CUR approach maintains the original format of the data, in its extracted columns and rows, allowing for interpretable data storage at the original resolution of the simulation. We utilize an iterative algorithm to compute the CUR decomposition of simulation output by maximizing the volume, or linearly independent information content, of a low-rank submatrix contained within the data. Experiments over n×n randomized test matrices with embedded rank-deficient features show that this maximum volume implementation of CUR matrix approximation has reduced asymptotic computational complexity on the order of n compared to the SVD, which scales approximately as n³. These results show that the CUR technique can be used to effectively select time step snapshots (columns) of over 140 SOLPS-ITER output variables and the associated discretized coordinate timeseries (rows), allowing for reconstruction of the complete simulation dynamics.
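The idea of selecting actual columns and rows by maximizing submatrix volume can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `maxvol` and `cur_maxvol`, the greedy pivoting used for initialization, the swap tolerance, the fixed number of alternating sweeps, and the use of the Frobenius norm for the error check are all assumptions made for illustration. For clarity the sketch also recomputes the coefficient matrix at every swap, so it does not reproduce the complexity scaling quoted above.

```python
import numpy as np


def maxvol(C, tol=1.05, max_iter=100):
    """Pick r rows of a tall (m x r) full-column-rank matrix C whose
    r x r submatrix has locally maximal volume (|determinant|)."""
    m, r = C.shape
    # Initial guess: greedy pivoting, one row per column (illustrative choice).
    W = np.array(C, dtype=float)
    rows = []
    for j in range(r):
        i = int(np.argmax(np.abs(W[:, j])))
        rows.append(i)
        # Rank-one elimination zeroes row i and column j of W.
        W -= np.outer(W[:, j] / W[i, j], W[i, :])
    rows = np.array(rows)
    for _ in range(max_iter):
        # B = C @ inv(C[rows]); rows already selected map to the identity.
        B = np.linalg.solve(C[rows].T, C.T).T
        i, j = np.unravel_index(np.argmax(np.abs(B)), B.shape)
        if abs(B[i, j]) <= tol:
            break  # no swap increases the volume by more than the tolerance
        rows[j] = i  # swapping in row i scales the volume by |B[i, j]|
    return rows


def cur_maxvol(A, rank, n_sweeps=4, seed=0):
    """Alternate maxvol over columns and rows, then form A ~ C @ U @ R."""
    rng = np.random.default_rng(seed)
    cols = rng.choice(A.shape[1], size=rank, replace=False)
    for _ in range(n_sweeps):
        rows = maxvol(A[:, cols])    # best rows for the current columns
        cols = maxvol(A[rows, :].T)  # best columns for those rows
    C, R = A[:, cols], A[rows, :]    # stored in the original data format
    U = np.linalg.pinv(A[np.ix_(rows, cols)])
    return C, U, R, rows, cols


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic stand-in for a (coordinates x time steps) data matrix.
    A = rng.standard_normal((60, 5)) @ rng.standard_normal((5, 40))
    C, U, R, rows, cols = cur_maxvol(A, rank=5)
    rel_err = np.linalg.norm(A - C @ U @ R) / np.linalg.norm(A)
    print("selected rows:", rows, "selected columns:", cols)
    print("relative Frobenius error:", rel_err)
```

Because `C` and `R` are literal columns and rows of `A`, the stored factors remain directly interpretable (for instance, as chosen snapshots and coordinate timeseries), whereas the singular vectors of an SVD are linear mixtures of the data.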

Original language: English
Article number: 112089
Journal: Journal of Computational Physics
Volume: 484
DOIs
State: Published - Jul 1 2023

Funding

This work is supported in part by the US DOE under contract DE-AC05-00OR22725. Dr. Kenneth Allen was supported in part by the DOE Office of Science Graduate Student Research (SCGSR) program, and under thesis advisement by Dr. Ming-Jun Lai. The SCGSR program is administered by the Oak Ridge Institute for Science and Education (ORISE) and managed by the Oak Ridge Associated Universities (ORAU) for the DOE under contract number DE-SC0014664. This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the U.S. Department of Energy Office of Science under Contract No. DE-AC05-00OR22725. This manuscript has been authored by UT-Battelle, LLC, under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy (DOE). The publisher acknowledges the U.S. government license to provide public access under the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).

Funders                                            Funder number
DOE Office of Science Graduate Student Research
DOE Public Access Plan
SCGSR
U.S. Department of Energy                          DE-AC05-00OR22725, DE-SC0014664
Office of Science
Oak Ridge Associated Universities
Oak Ridge Institute for Science and Education

    Keywords

    • CUR matrix decomposition
    • Data compression
    • Dimensionality reduction
    • Low-rank matrix approximation
    • SOLPS-ITER
    • Scrape-off-layer
