Scaling SQL to the Supercomputer for Interactive Analysis of Simulation Data

Jens Glaser, Felipe Aramburú, William Malpica, Benjamín Hernández, Matthew Baker, Rodrigo Aramburú

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

AI and simulation workloads consume and generate large amounts of data that need to be searched, transformed and merged with other data. With the goal of treating data as a first-class citizen inside a traditionally compute-centric HPC environment, we explore how the use of accelerators and high-speed interconnects can speed up tasks which otherwise constitute bottlenecks in computational discovery workflows. BlazingSQL is a SQL engine that runs natively on NVIDIA GPUs and supports internode communication for fast analytics on terabyte-scale tabular data sets. We show how a fast interconnect improves query performance if leveraged through the Unified Communication X (UCX) middleware. We envision that future computing platforms will integrate accelerated database query capabilities for immediate and interactive analysis of large simulation data.

Original language: English
Title of host publication: Driving Scientific and Engineering Discoveries Through the Integration of Experiment, Big Data, and Modeling and Simulation - 21st Smoky Mountains Computational Sciences and Engineering, SMC 2021, Revised Selected Papers
Editors: Jeffrey Nichols, Arthur ‘Barney’ Maccabe, James Nutaro, Swaroop Pophale, Pravallika Devineni, Theresa Ahearn, Becky Verastegui
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 327-339
Number of pages: 13
ISBN (Print): 9783030964979
State: Published - 2022
Event: 21st Smoky Mountains Computational Sciences and Engineering Conference, SMC 2021 - Virtual, Online
Duration: Oct 18 2021 to Oct 20 2021

Publication series

Name: Communications in Computer and Information Science
Volume: 1512 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: 21st Smoky Mountains Computational Sciences and Engineering Conference, SMC 2021
City: Virtual, Online
Period: 10/18/21 to 10/20/21

Funding

This manuscript has been authored by UT-Battelle, LLC, under contract DE-AC05-00OR22725 with the US Department of Energy (DOE). The US government retains and the publisher, by accepting the article for publication, acknowledges that the US government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this manuscript, or allow others to do so, for US government purposes. DOE will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan). We are grateful to Oscar Hernandez (NVIDIA) for initial conceptualization of this research. We thank Arjun Shankar (ORNL) for support. This research used resources of the Oak Ridge Leadership Computing Facility (OLCF) at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.
