The National Microbiome Data Collaborative Data Portal: An integrated multi-omics microbiome data resource

Emiley A. Eloe-Fadrosh, Faiza Ahmed, Anubhav, Michal Babinski, Jeffrey Baumes, Mark Borkum, Lisa Bramer, Shane Canon, Danielle S. Christianson, Yuri E. Corilo, Karen W. Davenport, Brandon Davis, Meghan Drake, William D. Duncan, Mark C. Flynn, David Hays, Bin Hu, Marcel Huntemann, Julia Kelliher, Sofya Lebedeva, Po E. Li, Mary Lipton, Chien Chi Lo, Stanton Martin, David Millard, Kayd Miller, Mark A. Miller, Paul Piehowski, Elais Player Jackson, Samuel Purvine, T. B.K. Reddy, Rachel Richardson, Marisa Rudolph, Setareh Sarrafan, Migun Shakya, Montana Smith, Kelly Stratton, Jagadish Chandrabose Sundaramurthi, Pajau Vangay, Donald Winston, Elisha M. Wood-Charlson, Yan Xu, Patrick S.G. Chain, Lee Ann McCue, Douglas Mans, Christopher J. Mungall, Nigel J. Mouncey, Kjiersten Fagnan

Research output: Contribution to journal › Article › peer-review

30 Scopus citations

Abstract

The National Microbiome Data Collaborative (NMDC) Data Portal (https://data.microbiomedata.org) supports microbiome multi-omics data exploration and access through an integrated, distributed data framework aligned with the FAIR (Findable, Accessible, Interoperable and Reusable) data principles (1). The NMDC Data Portal currently hosts 10.2 terabytes of multi-omics microbiome data, spanning five data types (metagenomes, metatranscriptomes, metaproteomes, metabolomes, and natural organic matter characterizations), generated at two Department of Energy User Facilities, the Joint Genome Institute (JGI) at Lawrence Berkeley National Laboratory (LBNL) and the Environmental Molecular Systems Laboratory (EMSL) at Pacific Northwest National Laboratory (PNNL). A flexible data schema (https://github.com/microbiomedata/nmdc-schema) leveraging community-driven standards underpins how data is managed and integrated. Annotated multi-omic data products are produced by the NMDC workflows and linked through common biosamples to enable search capabilities based on environmental context, instrumentation, and functional attributes. As a pilot system, the NMDC Data Portal offers download capabilities and several search components, including an interactive geographic visualization of samples; an environmental classification distribution visualized through an interactive Sankey diagram; a time-series slider to select longitudinal samples of interest; and an UpSet plot displaying the number of multi-omics data sets generated from the same biosample within a study.
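
The idea of linking data products through a common biosample, and of counting which omics types co-occur per biosample (the quantity an UpSet plot displays), can be illustrated with a minimal sketch. It does not use the NMDC API or schema; the record layout, the biosample identifiers, and the function names below are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical records of (biosample_id, omics_type). The identifiers and
# values are invented for illustration and are not taken from the NMDC portal.
records = [
    ("biosample-1", "metagenome"),
    ("biosample-1", "metatranscriptome"),
    ("biosample-1", "metaproteome"),
    ("biosample-2", "metagenome"),
    ("biosample-2", "metaproteome"),
    ("biosample-3", "metagenome"),
    ("biosample-4", "metagenome"),
    ("biosample-4", "metaproteome"),
]


def omics_by_biosample(rows):
    """Group omics data types by the biosample they were generated from."""
    grouped = {}
    for biosample_id, omics_type in rows:
        grouped.setdefault(biosample_id, set()).add(omics_type)
    return grouped


def combination_counts(rows):
    """Count biosamples for each exact combination of omics types."""
    grouped = omics_by_biosample(rows)
    return Counter(frozenset(types) for types in grouped.values())


counts = combination_counts(records)

# Print combinations from most to least frequent, ties broken alphabetically.
for combo, n in sorted(counts.items(), key=lambda kv: (-kv[1], sorted(kv[0]))):
    print(", ".join(sorted(combo)), "->", n)
```

Each key of `counts` is one exact combination of data types and each value is the number of biosamples with that combination, which corresponds to one bar in an UpSet-style plot for a study.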

Original language: English
Pages (from-to): D828-D836
Journal: Nucleic Acids Research
Volume: 50
Issue number: D1
State: Published - Jan 7 2022
