DataFed: Towards reproducible research via federated data management

Dale Stansberry, Suhas Somnath, Jessica Breet, Gregory Shutt, Mallikarjun Shankar

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

11 Scopus citations

Abstract

The increasingly collaborative, globalized nature of scientific research combined with the need to share data and the explosion in data volumes present an urgent need for a scientific data management system (SDMS). An SDMS presents a logical and holistic view of data that greatly simplifies and empowers data organization, curation, searching, sharing, dissemination, etc. We present DataFed - a lightweight, distributed SDMS that spans a federation of storage systems within a loosely-coupled network of scientific facilities. Unlike existing SDMS offerings, DataFed uses high-performance and scalable user management and data transfer technologies that simplify deployment, maintenance, and expansion of DataFed. DataFed provides web-based and command-line interfaces to manage data and integrate with complex scientific workflows. DataFed represents a step towards reproducible scientific research by enabling reliable staging of the correct data at the desired environment.

Original language: English
Title of host publication: Proceedings - 6th Annual Conference on Computational Science and Computational Intelligence, CSCI 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1312-1317
Number of pages: 6
ISBN (Electronic): 9781728155845
DOIs
State: Published - Dec 2019
Event: 6th Annual International Conference on Computational Science and Computational Intelligence, CSCI 2019 - Las Vegas, United States
Duration: Dec 5 2019 to Dec 7 2019

Publication series

Name: Proceedings - 6th Annual Conference on Computational Science and Computational Intelligence, CSCI 2019

Conference

Conference: 6th Annual International Conference on Computational Science and Computational Intelligence, CSCI 2019
Country/Territory: United States
City: Las Vegas
Period: 12/5/19 to 12/7/19

Funding

This research used resources of the Oak Ridge Leadership Computing Facility (OLCF) and of the Compute and Data Environment for Science (CADES) at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.

Funders (funder number)
CADES
Data Environment for Science
U.S. Department of Energy (DE-AC05-00OR22725)
Office of Science

Keywords

• Cross-facility
• FAIR data principles
• Federated identity management
• Globus
• Scientific data management system
