TY - GEN
T1 - Large-scale compute-intensive analysis via a combined in-situ and co-scheduling workflow approach
AU - Sewell, Christopher
AU - Heitmann, Katrin
AU - Finkel, Hal
AU - Zagaris, George
AU - Parete-Koon, Suzanne T.
AU - Fasel, Patricia K.
AU - Pope, Adrian
AU - Frontiere, Nicholas
AU - Lo, Li-Ta
AU - Messer, Bronson
AU - Habib, Salman
AU - Ahrens, James
N1 - Publisher Copyright:
© 2015 ACM.
PY - 2015/11/15
Y1 - 2015/11/15
N2 - Large-scale simulations can produce hundreds of terabytes to petabytes of data, complicating and limiting the efficiency of workflows. Traditionally, outputs are stored on the file system and analyzed in post-processing. With the rapidly increasing size and complexity of simulations, this approach faces an uncertain future. Trending techniques consist of performing the analysis in-situ, utilizing the same resources as the simulation, and/or off-loading subsets of the data to a compute-intensive analysis system. We introduce an analysis framework developed for HACC, a cosmological N-body code, that uses both in-situ and co-scheduling approaches for handling petabyte-scale outputs. We compare different analysis set-ups ranging from purely off-line, to purely in-situ, to in-situ/co-scheduling. The analysis routines are implemented using the PISTON/VTK-m framework, allowing a single implementation of an algorithm that simultaneously targets a variety of GPU, multi-core, and many-core architectures.
AB - Large-scale simulations can produce hundreds of terabytes to petabytes of data, complicating and limiting the efficiency of workflows. Traditionally, outputs are stored on the file system and analyzed in post-processing. With the rapidly increasing size and complexity of simulations, this approach faces an uncertain future. Trending techniques consist of performing the analysis in-situ, utilizing the same resources as the simulation, and/or off-loading subsets of the data to a compute-intensive analysis system. We introduce an analysis framework developed for HACC, a cosmological N-body code, that uses both in-situ and co-scheduling approaches for handling petabyte-scale outputs. We compare different analysis set-ups ranging from purely off-line, to purely in-situ, to in-situ/co-scheduling. The analysis routines are implemented using the PISTON/VTK-m framework, allowing a single implementation of an algorithm that simultaneously targets a variety of GPU, multi-core, and many-core architectures.
UR - http://www.scopus.com/inward/record.url?scp=84966461515&partnerID=8YFLogxK
U2 - 10.1145/2807591.2807663
DO - 10.1145/2807591.2807663
M3 - Conference contribution
AN - SCOPUS:84966461515
T3 - International Conference for High Performance Computing, Networking, Storage and Analysis, SC
BT - Proceedings of SC 2015
PB - IEEE Computer Society
T2 - International Conference for High Performance Computing, Networking, Storage and Analysis, SC 2015
Y2 - 15 November 2015 through 20 November 2015
ER -