Abstract
Fusion energy experiments and simulations provide critical information needed to plan future fusion reactors. As next-generation devices like ITER move toward long-pulse experiments, analyses, including AI and ML, should be performed under a wide range of time and computing constraints, from near-real-time and between-shot analysis to campaign-wide long-term analysis. However, the volume, velocity, and variety of the data make such analyses extremely challenging when only local computational resources are used. Researchers need the ability to compose and execute workflows spanning edge resources to large-scale high-performance computing facilities. We present Delta, a system that addresses data analysis challenges in fusion science, including AI/ML, by leveraging the ADIOS I/O library and middleware to execute science workflows over the wide area network for near-real-time streaming. We discuss the data federation challenges in performing remote workflows, focusing on ongoing research in (1) managing, reducing, and streaming data to minimize I/O and data movement overheads, (2) decompressing and reorganizing data for analysis, and (3) executing workflows for automated data analysis. We introduce examples of deep-learning-based data analysis for the fusion domain and demonstrate how we use Delta to construct end-to-end workflows for a fusion device in Korea, connecting it to a remote DOE facility in the USA. The capability demonstrated by this project is the basis for improving the state of the art for near-real-time data federation amongst remote facilities.
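The three stages named in the abstract (reduce and stream, decompress and reorganize, analyze) can be illustrated with a minimal sketch. It uses only the Python standard library and is not Delta's or ADIOS's API: the function names `produce`, `consume`, and `run_pipeline`, the chunk layout, and the use of lossless `zlib` compression over an in-process queue are all assumptions made for illustration.

```python
import queue
import struct
import threading
import zlib

# Hypothetical wire format: a (channels, samples) header of two little-endian
# uint32 values, followed by the zlib-compressed float32 payload.
CHUNK_HEADER = struct.Struct("<II")


def produce(chunks, link):
    """Stage 1: reduce each chunk (lossless zlib here) and stream it."""
    for channels, samples, values in chunks:
        payload = struct.pack(f"<{channels * samples}f", *values)
        link.put(CHUNK_HEADER.pack(channels, samples) + zlib.compress(payload))
    link.put(None)  # end-of-stream marker


def consume(link, analyze):
    """Stages 2 and 3: decompress, reorganize per channel, then analyze."""
    results = []
    while (message := link.get()) is not None:
        channels, samples = CHUNK_HEADER.unpack_from(message)
        raw = zlib.decompress(message[CHUNK_HEADER.size:])
        flat = struct.unpack(f"<{channels * samples}f", raw)
        # Reorganize the flat stream into one row of samples per channel.
        rows = [flat[c * samples:(c + 1) * samples] for c in range(channels)]
        results.append([analyze(row) for row in rows])
    return results


def run_pipeline(chunks, analyze):
    """Run sender and receiver concurrently over a bounded queue."""
    link = queue.Queue(maxsize=4)  # assumption: stands in for the WAN link
    sender = threading.Thread(target=produce, args=(chunks, link))
    sender.start()
    results = consume(link, analyze)
    sender.join()
    return results
```

Calling `run_pipeline(chunks, max)` with chunks given as `(channels, samples, values)` tuples returns one list of per-channel maxima per chunk. The bounded queue makes the sender wait when the receiver falls behind, which is one simple way to model back-pressure in a near-real-time stream.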
Original language | English |
---|---|
Title of host publication | Driving Scientific and Engineering Discoveries Through the Convergence of HPC, Big Data and AI - 17th Smoky Mountains Computational Sciences and Engineering Conference, SMC 2020, Revised Selected Papers |
Editors | Jeffrey Nichols, Arthur ‘Barney’ Maccabe, Suzanne Parete-Koon, Becky Verastegui, Oscar Hernandez, Theresa Ahearn |
Publisher | Springer Science and Business Media Deutschland GmbH |
Pages | 285-299 |
Number of pages | 15 |
ISBN (Print) | 9783030633929 |
DOIs | |
State | Published - 2021 |
Event | 17th Smoky Mountains Computational Sciences and Engineering Conference, SMC 2020 - Virtual, Online. Duration: Aug 26 2020 → Aug 28 2020 |
Publication series
Name | Communications in Computer and Information Science |
---|---|
Volume | 1315 CCIS |
ISSN (Print) | 1865-0929 |
ISSN (Electronic) | 1865-0937 |
Conference
Conference | 17th Smoky Mountains Computational Sciences and Engineering Conference, SMC 2020 |
---|---|
City | Virtual, Online |
Period | 08/26/20 → 08/28/20 |
Funding
Acknowledgement. This research was supported by the Department of Energy’s SciDAC RAPIDS Institute and the HBPS SciDAC Partnership, as well as the Exascale Computing Project (17-SC-20-SC), a collaborative effort of U.S. Department of Energy Office of Science and the National Nuclear Security Administration. This material is based upon work supported by the U.S. Department of Energy, Office of Science, Office of Fusion Energy Sciences, under AC02-09CH11466. This research used resources of the Argonne and Oak Ridge Leadership Computing Facilities, DOE Office of Science User Facilities supported under Contracts DE-AC02-06CH11357 and DE-AC05-00OR22725, respectively, as well as the National Energy Research Scientific Computing Center (NERSC), a U.S. Department of Energy Office of Science User Facility operated under Contract No. DE-AC02-05CH11231. The research at KSTAR was conducted as part of KSTAR R&D Program of National Fusion Research Institute of Korea (EN2001-11). J. Choi et al. contributed equally.
This manuscript has been co-authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan).
Keywords
- Data federation
- Data streams
- Fusion
- Remote data analysis