Near real-time streaming analysis of big fusion data

R. Kube, R. M. Churchill, C. S. Chang, J. Choi, R. Wang, S. Klasky, L. Stephey, E. Dart, M. J. Choi

Research output: Contribution to journal, Article, peer-review

3 Scopus citations

Abstract

Experiments on fusion plasmas produce high-dimensional data time series with ever-increasing magnitude and velocity, but turn-around times for analysis of this data have not kept up. For example, many data analysis tasks are performed in a manual, ad-hoc manner some time after an experiment. In this article, we introduce the Delta framework, which facilitates near real-time streaming analysis of big and fast fusion data. By streaming measurement data from fusion experiments to a high-performance compute center, Delta allows computationally expensive data analysis tasks to be performed in between plasma pulses. This article describes the modular and expandable software architecture of Delta and presents performance benchmarks of individual components as well as of an example workflow. Focusing on a streaming analysis workflow where electron cyclotron emission imaging (ECEi) data is measured at KSTAR and streamed to the National Energy Research Scientific Computing Center's (NERSC's) supercomputer, we routinely observe data transfer rates of about 4 gigabits per second. At NERSC, a demanding turbulence analysis workflow effectively utilizes multiple nodes and graphics processing units and executes in under 5 min. We further discuss how Delta uses modern database systems and container orchestration services to provide web-based real-time data visualization. For the case of ECEi data, we demonstrate how data visualizations can be augmented with outputs from machine learning models. By providing session leaders and physics operators with results of higher-order data analysis through live visualizations, Delta may help them make more informed decisions on how to configure the machine for the next shot.

Original language: English
Article number: 035015
Journal: Plasma Physics and Controlled Fusion
Volume: 64
Issue number: 3
DOIs
State: Published - Mar 2022

Funding

This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Fusion Energy Sciences, under Contract No. DE-AC02-09CH11466. This research used resources of the National Energy Research Scientific Computing Center (NERSC), a U.S. Department of Energy Office of Science User Facility located at Lawrence Berkeley National Laboratory, operated under Contract No. DE-AC02-05CH11231. This work was supported by R&D Programs of ‘KSTAR Experimental Collaboration and Fusion Plasma Research (EN2101-12)’. Delta is available on GitHub [17].

Keywords

  • big data
  • fusion energy
  • machine learning
  • streaming analysis
