TY - CONF
T1 - Active workflow system for Near Real-Time extreme-scale science
AU - Zhang, Yanwei
AU - Liu, Qing
AU - Klasky, Scott
AU - Wolf, Matthew
AU - Schwan, Karsten
AU - Eisenhauer, Greg
AU - Choi, Jong
AU - Podhorszki, Norbert
PY - 2014
Y1 - 2014
N2 - In recent years, streaming-based data processing has been gaining substantial traction for dealing with overwhelming data generated by real-time applications, from both enterprise sources and scientific computing. In this work, however, we look at an emerging class of scientific data with Near Real-Time (NRT) requirements, in which data is typically generated in a bursty fashion with the near real-time constraints being applied primarily between bursts, rather than within a stream. A key challenge for this type of data source is that the processing time per data element is not uniform, and not always feasible to predict. Given the observations on the increasing unpredictability of compute load and system dynamics, this work looks to adapt a streaming-based approach to the context of this new class of large experiments and simulations that have complex run-time control and analysis issues. In particular, we deploy a novel two-tier scheme for handling the increasing unpredictability of runtime behaviors: Instead of relying on determining what and where to run the scientific workflows beforehand or partially dynamically, the decision will also be adaptively enhanced online according to system runtime status. This is enabled by embedding the workflow along with data streams. Specifically, we break data outputs generated from experiments or simulations into multiple self-describing "chunks", which we call active data objects. As such, if there is a transient hotspot observed, a data object with an unfinished workflow pipeline can break its previous schedule and search for the least loaded location to continue the execution. Our preliminary experiment results based on synthetic workloads demonstrate the proposed active workflow system as a very promising solution by outperforming the state-of-the-art semi-dynamic workflow schedulers with an improved workflow completion time, as well as good scalability.
AB - In recent years, streaming-based data processing has been gaining substantial traction for dealing with overwhelming data generated by real-time applications, from both enterprise sources and scientific computing. In this work, however, we look at an emerging class of scientific data with Near Real-Time (NRT) requirements, in which data is typically generated in a bursty fashion with the near real-time constraints being applied primarily between bursts, rather than within a stream. A key challenge for this type of data source is that the processing time per data element is not uniform, and not always feasible to predict. Given the observations on the increasing unpredictability of compute load and system dynamics, this work looks to adapt a streaming-based approach to the context of this new class of large experiments and simulations that have complex run-time control and analysis issues. In particular, we deploy a novel two-tier scheme for handling the increasing unpredictability of runtime behaviors: Instead of relying on determining what and where to run the scientific workflows beforehand or partially dynamically, the decision will also be adaptively enhanced online according to system runtime status. This is enabled by embedding the workflow along with data streams. Specifically, we break data outputs generated from experiments or simulations into multiple self-describing "chunks", which we call active data objects. As such, if there is a transient hotspot observed, a data object with an unfinished workflow pipeline can break its previous schedule and search for the least loaded location to continue the execution. Our preliminary experiment results based on synthetic workloads demonstrate the proposed active workflow system as a very promising solution by outperforming the state-of-the-art semi-dynamic workflow schedulers with an improved workflow completion time, as well as good scalability.
KW - Distributed workflow scheduler
KW - Load balancing
KW - Near real-time science
KW - Scientific workflow system
KW - Stream processing
KW - System dynamics
UR - http://www.scopus.com/inward/record.url?scp=84897508694&partnerID=8YFLogxK
U2 - 10.1145/2567634.2567637
DO - 10.1145/2567634.2567637
M3 - Paper
AN - SCOPUS:84897508694
SP - 53
EP - 61
T2 - 2014 1st Workshop on Parallel Programming for Analytics Applications, PPAA 2014
Y2 - 16 February 2014 through 16 February 2014
ER -