Multi-Omics integration can be used to rescue metabolic information for some of the dark region of the Pseudomonas putida proteome

Steven Tavis, Robert L. Hettich

Research output: Contribution to journal › Article › peer-review

Abstract

In every omics experiment, genes or their products are identified for which even state-of-the-art tools are unable to assign a function. In the biotechnology chassis organism Pseudomonas putida, these proteins of unknown function make up 14% of the proteome. This missing information can bias analyses, since these proteins can carry out functions that affect the engineering of organisms. Because they predict protein function across all organisms, function prediction tools generally fail to use all of the types of data available for any specific organism, including protein and transcript expression information. Additionally, the release of AlphaFold predictions for all UniProt proteins provides a novel opportunity for leveraging structural information. We constructed a bespoke machine learning model to predict the function of recalcitrant proteins of unknown function in Pseudomonas putida based on these sources of data, which annotated 1079 terms to 213 proteins. Among the predicted functions supplied by the model, we found evidence for a significant overrepresentation of nitrogen metabolism and macromolecule processing proteins. These findings were corroborated by manual analyses of selected proteins, which identified, among others, a functionally unannotated operon that likely encodes a branch of the shikimate pathway.

Original language: English
Article number: 267
Journal: BMC Genomics
Volume: 25
Issue number: 1
State: Published - Dec 2024

Funding

Funding was provided by the BioEnergy Science Center and the Center for Bioenergy Innovation at ORNL, both supported by the U.S. Department of Energy (DOE) Office of Biological and Environmental Research in the DOE Office of Science. Oak Ridge National Laboratory is managed by University of Tennessee-Battelle LLC for the Department of Energy under contract DOE-AC05-00OR22725.

Funders (funder number, where given)
U.S. Department of Energy (DOE-AC05-00OR22725)
Biological and Environmental Research
Office of Science
University of Tennessee-Battelle LLC
BioEnergy Science Center
Center for Bioenergy Innovation at ORNL

    Keywords

    • Function prediction
    • Gene ontology
    • Machine learning
    • Multi-omics integration
    • Proteins of unknown function
    • Pseudomonas putida
