Genome Extraction from Shotgun Metagenome Sequence Data

  • Dylan Chivian (Creator)
  • Sean P. Jungbluth (Creator)
  • Paramvir Dehal (Creator)
  • Richard S. Canon (Creator)
  • Benjamin Allen (Creator)
  • Mikayla M. Clark (Creator)
  • Tianhao Gu (Creator)
  • Miriam L. Land (Creator)
  • Gavin A. Price (Creator)
  • William J. Riehl (Creator)
  • Michael W. Sneddon (Creator)
  • Roman A. Sutormin (Creator)
  • Qizhi Zhang (Creator)
  • Robert Cottingham Jr (Creator)
  • Christopher S. Henry (Creator)
  • Adam P. Arkin (Creator)

Dataset

Description

Uncultivated Bacteria and Archaea comprise the vast majority of species on Earth, but obtaining their genomes directly from the environment using shotgun sequencing has only recently become possible. To realize the hope of capturing Earth’s microbial genetic complement, technologies that accelerate recovery of high-quality genomes are necessary. We present a series of analysis steps and data products for the extraction of high-quality metagenome-assembled genomes (MAGs) from microbiomes using the U.S. Department of Energy Systems Biology Knowledgebase (KBase) platform (http://www.kbase.us/). In KBase, the process is end-to-end, allowing a user to go from the initial sequencing reads all the way through to MAGs, which can then be analyzed with other KBase capabilities such as phylogenetic placement, functional assignment, metabolic modeling, pangenome functional profiling, RNA-Seq, and others. Although portions of such capabilities are individually available from other resources, the combination of intuitive usability, data interoperability, and integration of tools in a freely available compute resource makes KBase a uniquely powerful platform for obtaining MAGs from microbiomes. In addition to offering tools for each of the key steps in the genome extraction process, this workflow provides a scaffold that can be easily extended with additional MAG recovery and analysis tools via the KBase SDK (Software Development Kit).
Date made available: Nov 9 2021
Publisher: DOE Systems Biology Knowledgebase (KBase)

Funding

DE-AC02-05CH11231, DE-AC02-06CH11357, DE-AC05-00OR22725, and DE-AC02-98CH10886
