TY - GEN
T1 - Tracking files in the Kepler provenance framework
AU - Mouallem, Pierre
AU - Barreto, Roselyne
AU - Klasky, Scott
AU - Podhorszki, Norbert
AU - Vouk, Mladen
PY - 2009
Y1 - 2009
N2 - Workflow Management Systems (WFMS), such as Kepler, are proving to be an important tool in scientific problem solving. They can automate and manage complex processes and huge amounts of data produced by petascale simulations. Typically, the produced data need to be properly visualized and analyzed by scientists in order to achieve the desired scientific goals. Both run-time and post analysis may benefit from, or even require, additional meta-data (provenance information). One of the challenges in this context is the tracking of the data files that can be produced in very large numbers during stages of the workflow, such as visualizations. The Kepler provenance framework collects all or part of the raw information flowing through the workflow graph. This information then needs to be further parsed to extract meta-data of interest. This can be done through add-on tools and algorithms. We show how to automate tracking specific information such as data file locations.
AB - Workflow Management Systems (WFMS), such as Kepler, are proving to be an important tool in scientific problem solving. They can automate and manage complex processes and huge amounts of data produced by petascale simulations. Typically, the produced data need to be properly visualized and analyzed by scientists in order to achieve the desired scientific goals. Both run-time and post analysis may benefit from, or even require, additional meta-data (provenance information). One of the challenges in this context is the tracking of the data files that can be produced in very large numbers during stages of the workflow, such as visualizations. The Kepler provenance framework collects all or part of the raw information flowing through the workflow graph. This information then needs to be further parsed to extract meta-data of interest. This can be done through add-on tools and algorithms. We show how to automate tracking specific information such as data file locations.
KW - Data Provenance
KW - Data Tracking
KW - Scientific Data Management
KW - Scientific Workflows
UR - http://www.scopus.com/inward/record.url?scp=69049118434&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-02279-1_21
DO - 10.1007/978-3-642-02279-1_21
M3 - Conference contribution
AN - SCOPUS:69049118434
SN - 3642022782
SN - 9783642022784
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 273
EP - 282
BT - Scientific and Statistical Database Management - 21st International Conference, SSDBM 2009, Proceedings
T2 - 21st International Conference on Scientific and Statistical Database Management, SSDBM 2009
Y2 - 2 June 2009 through 4 June 2009
ER -