Abstract
Motivation: Sphagnum-dominated peatlands store a substantial amount of terrestrial carbon. The genus is undersampled and under-studied. No experimental crystal structure from any Sphagnum species exists in the Protein Data Bank and fewer than 200 Sphagnum-related genes have structural models available in the AlphaFold Protein Structure Database. Tools and resources are needed to help bridge these gaps, and to enable the analysis of other structural proteomes now made possible by accurate structure prediction.
Results: We present the predicted structural proteome (25,134 primary transcripts) of Sphagnum divinum computed using AlphaFold, structural alignment results of all high-confidence models against an annotated nonredundant crystallographic database of over 90,000 structures, a structure-based classification of putative Enzyme Commission (EC) numbers across this proteome, and the computational method to perform this proteome-scale structure-based annotation.
Original language | English |
---|---|
Article number | btad511 |
Journal | Bioinformatics |
Volume | 39 |
Issue number | 8 |
State | Published - Aug 1 2023 |
Funding
This work was supported by the Office of Biological and Environmental Research (BER) Genomic Science program within the US Department of Energy (DOE) Office of Science [BER DE-SC0021303, ERKP917, DOE BER Early Career Research Program]; the Oak Ridge National Laboratory, under the Laboratory Directed Research and Development Program [LDRD 09832]; the US DOE Joint Genome Institute, a DOE Office of Science User Facility [DE-AC02-05CH11231, proposal 10.46936/10.25585/60001030]; and used resources of the Oak Ridge Leadership Computing Facility, a DOE Office of Science User Facility [DE-AC05-00OR22725].